torben-friis 7 hours ago

>A caveat: Lines of code is an imperfect measure, as it measures quantity over quality. So 8× lines of code/engineer/day in the second quarter of 2026 is almost certainly an overstatement of the true productivity gain. Nonetheless, it indicates an acceleration. At Anthropic, we don’t reward people for how many lines of code they write; rather, team members are producing more code simply because they’re using AI systems to write more code.

What about the hypothesis that AI is generating more verbose code? I just see the text pretending to acknowledge "LOC != Productivity" and then using it as a metric anyway.

malfist 5 hours ago | parent | next [-]

One of my co-workers just asked me to review his pull request that was all AI generated. 600 files were touched, over 40k lines of code added.

I'm sure he thought that was a crowning achievement, proof that AI can enable 10X developers. After all, what engineer could write 40k lines of code in a week?

I declined to review it, stating that I couldn't possibly vet 40k lines of code, and wouldn't put my reputation on the line to stamp the work as good. The PR nagged me for 2 weeks from my todo list and then disappeared. I don't know if he found another dev to get an approval from, or if the PR was abandoned. But I know for sure that he and I are on two totally separate islands around the value of LLMs.

fg137 3 hours ago | parent | next [-]

Same here. A co-worker touched a few hundred files in a PR and asked us to review. They merged it directly to main when nobody approved it. (The repo was not set up to enforce PR approval.)

I don't personally use that feature, and I couldn't care less at this point. If our customers are frustrated by the bugs, at least my name is not on it.

squidsoup 3 hours ago | parent | prev | next [-]

That's a process problem at your company - no developer should be proposing branches over 1k loc (or whatever your agreed tolerance threshold is) without a very good reason, vibe coded or not.

CamperBob2 3 hours ago | parent | prev [-]

>I declined to review it, stating that I couldn't possibly vet 40k lines of code

Gee, that sounds like a job for Claude if there ever was one.

malfist 3 hours ago | parent [-]

You're absolutely right!

keeda 3 hours ago | parent | prev | next [-]

So the more rigorous studies about AI-assisted coding productivity addressed this by keeping in place all other software development processes, including the same code review and quality standards, and only measuring throughput (PRs, LoC) before and after AI was allowed.

Hence the interpretation of this 8x number depends on whether (or how much) Anthropic engineers have changed their quality standards and development processes. They don't tell us, and I am not aware of any other indications we could use to make a judgment.

However, we can still do some theorycrafting! I'm convinced that to fully realize the potential of AI-assisted coding we need to revamp all the dev processes, especially how we validate code, and it would be foolish of Anthropic not to do so (unless they were conducting a rigorous study, which they don't claim to have done.)

My hypothesis on the future of software validation is nothing fancy: we simply want much, much more automation for tests, observability and other bespoke verification methods than we traditionally had. But then validation code will also contribute to the LoC! My observation so far of personal as well as some "vibe-coded" open-source projects is O(LoC production code) ~= O(LoC test code). So as a SWAG, if roughly half of that 8x is test code, the upper bound could be something like a 3-4x speedup, which is still remarkable.

All bets are off if code quality standards are not the same.
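
A rough sketch of that back-of-envelope in Python, for anyone who wants to play with the numbers. The model and every name in it (`production_speedup`, `test_to_prod_ratio`, `baseline_test_ratio`) are my own assumptions for illustration, not something Anthropic or any study published: it simply splits the post-AI LoC into production and test code and compares the production share against the pre-AI baseline.

```python
def production_speedup(
    total_loc_multiplier: float,
    test_to_prod_ratio: float,
    baseline_test_ratio: float = 0.0,
) -> float:
    """Estimate the production-code speedup implied by a raw LoC multiplier.

    Hypothetical model (an assumption, not a published method):
      - before AI, each production line came with `baseline_test_ratio`
        lines of test code;
      - after AI, each production line comes with `test_to_prod_ratio`
        lines of test code;
      - total LoC (production + test) grew by `total_loc_multiplier`.
    """
    if test_to_prod_ratio < 0 or baseline_test_ratio < 0:
        raise ValueError("ratios must be non-negative")
    # Production share of the total is 1 / (1 + tests-per-production-line).
    before_share = 1.0 / (1.0 + baseline_test_ratio)
    after_share = 1.0 / (1.0 + test_to_prod_ratio)
    return total_loc_multiplier * after_share / before_share


# 8x total LoC, test code roughly equal to production code,
# and a baseline that was (pessimistically) all production code.
print(production_speedup(8, 1.0))  # 4.0
```

Under these assumptions an 8x LoC figure with a 1:1 test-to-production ratio works out to 4x; a heavier test ratio pulls it toward 3x, while a baseline that already included tests pushes it back up. None of this says anything about quality, which is the caveat above.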

overgard 2 hours ago | parent | prev | next [-]

I just watched Copilot today turn an 8-line fix into 500 lines, so, yeah, verbosity is a big side effect

verdverm 2 hours ago | parent [-]

It occurs to me this pattern might be the average code we humans have produced. We all have made those quick fixes, copy-pastas, and dirty hacks... they learned it somewhere! I also assume that some of the behavior is an artifact of their training regime.

TheRoque 41 minutes ago | parent [-]

So with LLMs outputting average code, and people using LLMs more and more, I guess the average code will become worse over time?

nielsbot 30 minutes ago | parent [-]

Not advocating for AI code slop--but if AI coded software works correctly, maybe it doesn't matter? Except sometimes when a specialist will have to get involved. Not a perfect analogy, but most people don't write assembly these days--they have a compiler do that. Assembly still has a place, but it's a specialist task.

fooqux 5 hours ago | parent | prev | next [-]

Exactly. If AI is going to start being graded on how many LoC it generates - oh, I'm sorry, how much it "accelerates" - then guess what newer models will start doing more of?

whateveracct 4 hours ago | parent | prev | next [-]

Yeah, they assume that "productivity = k * LOC" where k > 1

very flawed

chuckadams 5 hours ago | parent | prev | next [-]

AI generates code that mimics the existing code. If your code is terse and comment-free, then the agent’s code is too. The times I’ve seen Claude drift into a default “house style” it generated like 1 comment for every 10 LOC or so. It’s a far cry from the GPT-3 days that littered every line with the journals of Captain Obvious.

5 hours ago | parent | prev [-]
[deleted]