js2 5 hours ago

Hah... yeah, no, its Python isn't great. It's definitely workable and better than what I see from 9/10 junior engineers, but it tends to be pretty verbose and over-engineered.

My repos all have pre-commit hooks which run the linters/formatters/type-checkers. Both Claude and Gemini will sometimes write code that won't get past mypy, and they'll then struggle to get it typed correctly before eventually bypassing the pre-commit check with `git commit -n`.
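
A minimal sketch of the kind of typing error involved, for anyone who hasn't hit it. This is an illustration only: `find_port` and `next_port` are hypothetical names, not code from the repos discussed here. mypy rejects arithmetic on a value that may be `None`, and the fix is to narrow the type rather than skip the hook.

```python
from typing import Optional


def find_port(config: dict[str, str], key: str) -> Optional[int]:
    """Return the port stored under `key`, or None if it is absent."""
    raw = config.get(key)
    return int(raw) if raw is not None else None


def next_port(config: dict[str, str], key: str) -> int:
    """Return the port after the one stored under `key`."""
    port = find_port(config, key)
    # Writing `return port + 1` directly is what mypy flags here,
    # because `port` may still be None at this point.
    if port is None:
        raise KeyError(key)
    # After the check above, mypy treats `port` as a plain int.
    return port + 1
```

The explicit `None` check costs two lines and gets the commit through the type checker honestly.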

I've had to add some fairly specific instructions to CLAUDE.md/GEMINI.md to get them to cut this out.

Claude is better about following the rules. Gemini just flat out ignores instructions. I've also found Gemini is more likely to get stuck in a loop and give up.

That said, I'm saying this after about 100 hours of experience with these LLMs. I'm sure they'll get better with their output and I'll get better with my input.

physicsguy 2 hours ago | parent | next [-]

To be fair, depending on which libraries you’re using, Python typing isn’t exactly easy even for a human. I spend more time battling type checkers and stubs than I would like.

hkt an hour ago | parent | prev [-]

I can confirm input matters a lot. I'm a couple of hundred hours ahead of you and my prompting has come along a lot. I recommend test cycles, prompts to reflect on product-implementation fit (e.g., "is this what you've been asked to do?") and lots of interactivity. Despite what I've written elsewhere in these comments, the best work is a good one-shot followed by small iterations and attentive steering.