VBprogrammer 2 hours ago

> Can there be bugs? Sure. That's the price of not reading or understanding every line.

I've yet to come across a human developer whose output would meet this standard, despite writing every line.

In fact, having an LLM review our code is catching quite a few bugs before it reaches QA.

ben_w 2 hours ago | parent [-]

Indeed, though I find the distribution is different.

The humans may skip unit tests and need reminding; the AI always writes unit tests once that's in AGENTS.md or whatever, but my experience* was that 5-10% of the time the LLM's attempt at a "test" would, instead of executing the code and examining the results, open the source code as a text file and run a regex to find/exclude certain substrings.

* At the start of this year, because Anthropic and OpenAI were both offering free trials. IDK how much things have changed since then, some things change fast in this domain, other things don't.
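
To make the difference concrete, here is a minimal Python sketch of the two kinds of "test". Everything in it is hypothetical and for illustration only: `apply_discount`, its deliberate bug, and both test functions are assumptions, not taken from any real codebase or from actual LLM output. The source is held in a string as a stand-in for a file opened as text.

```python
import re

# Hypothetical source under test, kept as text so both styles can be shown.
# It contains a deliberate bug: the discount is added instead of subtracted.
SOURCE = '''
def apply_discount(price, percent):
    return price + price * percent / 100
'''

# Load the function so it can actually be executed.
namespace = {}
exec(SOURCE, namespace)
apply_discount = namespace["apply_discount"]


def source_regex_test() -> bool:
    """Anti-pattern: inspects the source text and never runs the code."""
    return re.search(r"percent\s*/\s*100", SOURCE) is not None


def behavioural_test() -> bool:
    """Executes the code and examines the result."""
    return apply_discount(200.0, 10.0) == 180.0


if __name__ == "__main__":
    print("regex 'test' passes:", source_regex_test())
    print("behavioural test passes:", behavioural_test())
```

The regex check passes because the expected substring is present, while the behavioural check fails because the function returns the wrong value. That gap is why the first style gives false confidence.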

dezgeg 14 minutes ago | parent | next [-]

I have seen some pre-AI over-mocked codebases where the "tests" were essentially that (but harder to read than a regex would have been).

baq an hour ago | parent | prev [-]

I’ve been piloting LLMs nonstop for the past six months, and we’re at the point where formally verified models generated as an intermediate step between spec and code are very good value.

Riding the exponential means you have to update priors more often.