keeda a day ago

Actually, quite the opposite. It seems any positive comment about AI coding gets at least one response along the lines of "Oh yeah, show me proof" or "Where is the deluge of vibe-coded apps?"

For my part, I point out there are a significant number of studies showing clear productivity boosts in coding, but those threads typically devolve into "How can they prove anything when we don't even know how to measure developer productivity?" (The better studies address this question and tackle it with well-designed statistical methods such as randomized controlled trials.)

Also, there are some pretty large GitHub repos out there that are mostly vibe-coded. Like, Steve Yegge got to something like 350 thousand LoC in 6 weeks on Beads. I've not looked at it closely, but the commit history is there for anyone to see: https://github.com/steveyegge/beads/commits/main/

reppap a day ago | parent | next [-]

That seems like a lot more code than a tool like that should require.

keeda a day ago | parent [-]

It does, but I have no mental model of what would be required to efficiently coordinate a bunch of independently operating agents, so it's hard to make a judgement.

Also, about half of it seems to be tests. It even has performance benchmarks, which are always a distant afterthought for anything other than infrastructure code in the hottest of loops! https://github.com/steveyegge/beads/blob/main/BENCHMARKS.md

This is one of the defining characteristics of vibe-coded projects: Extensive tests. That's what keeps the LLMs honest.
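To be concrete, most of that test volume is cheap, mechanical checks like this (a made-up Python sketch; serialize/parse are hypothetical stand-ins, not actual Beads code):

    # Made-up example of the round-trip style tests agents churn out.
    # serialize/parse are hypothetical stand-ins, not real Beads functions.
    import json

    def serialize(issue: dict) -> str:
        return json.dumps(issue, sort_keys=True)

    def parse(blob: str) -> dict:
        return json.loads(blob)

    def test_round_trip():
        issue = {"id": "bd-1", "title": "fix login", "deps": ["bd-2"]}
        assert parse(serialize(issue)) == issue

Trivial individually, but in bulk they pin down behavior the LLM would otherwise happily drift away from.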

I had commented previously (https://news.ycombinator.com/item?id=45729826) that the logical conclusion of AI coding will look very weird to us and I guess this is one glimpse of it.

Ianjit 19 hours ago | parent | prev | next [-]

Please provide links to the studies; I am genuinely curious. I have been looking for data, but most studies I find showing an uplift are just looking at LOC or PRs, which of course is nonsense.

Meta measured a 6-12% uplift in productivity from adopting agentic coding. That's paltry. A Stanford case study found that after accounting for buggy code that needed to be reworked, there may be no productivity uplift.

I haven't seen any study showing a genuine uplift after accounting for properly reviewing and fixing the AI generated code.

kbelder 8 hours ago | parent | next [-]

> Meta measured a 6-12% uplift in productivity from adopting agentic coding. That's paltry.

That feels like the right ballpark. I would have estimated 10-20%. But I'd say that's not paltry at all. If it's a 10% boost, it's worth paying for. Not transformative, but worthwhile.

I compare it to moving from a single monitor to a multi-monitor setup, or getting a dev their preferred IDE.

keeda 7 hours ago | parent | prev [-]

I mention a few here: https://news.ycombinator.com/item?id=45379452

> ... just looking at LOC or PRs, which of course is nonsense.

That's basically a variation of "How can they prove anything when we don't even know how to measure developer productivity?" ;-)

And the answer is the same: robust statistical methods! For instance, amongst other things, they compare the same developers over time doing regular day-job tasks with the same quality-control processes (review etc.) in place, before and after being allowed to use AI. It's like an A/B test. Averaging across a large N and a long time window accounts for a lot of the day-to-day variation.
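A toy sketch of that design (synthetic data, all numbers invented; nothing from the actual studies):

    # Within-developer before/after comparison on synthetic data.
    # Each developer serves as their own control.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_devs = 500
    before = rng.normal(10, 3, n_devs)            # tasks/week pre-AI
    after = before + rng.normal(1.5, 2, n_devs)   # hypothetical AI effect

    t, p = stats.ttest_rel(after, before)         # paired t-test
    print(f"mean uplift: {(after - before).mean():.2f} tasks/week, p = {p:.2g}")

The pairing is the point: it cancels out each developer's baseline speed, so you never need an absolute definition of "productivity."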

Note that they do not claim to measure individual or team productivity, but they do find a large, statistically significant difference in the data. Worth reading the methodologies to assuage any doubts.

> A Stanford case study found that after accounting for buggy code that needed to be re-worked there may be no productivity uplift.

I'm not sure if we're talking about the same Stanford study; the one in the link above (100K engineers across 600+ companies) does account for "code churn" (ostensibly fixing AI bugs) and still finds an overall productivity boost in the 5-30% range. This depends a LOT on the use case (e.g., complex tasks on legacy COBOL codebases actually see a negative impact).
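"Accounting for code churn" here just means discounting output that gets reworked shortly after landing. Back-of-envelope, with every number invented:

    # Illustrative churn adjustment -- all figures made up.
    raw_uplift = 1.25      # ship 25% more code with AI
    ai_churn = 0.12        # fraction of AI-assisted code reworked soon after
    base_churn = 0.07      # some rework happens even without AI
    net = raw_uplift * (1 - ai_churn) / (1 - base_churn)
    print(f"churn-adjusted uplift: {net - 1:.1%}")   # ~18%, down from 25%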

In any case, most of these studies seem to agree on a 15-30% boost.

Note these are mostly from the ~2024 timeframe, using the models from then without today's agentic coding harnesses. I would bet the number is much higher these days. More recent reports from sources like DX find up to a 60% increase in throughput, though I haven't looked closely at that and have some doubts.

> Meta measured a 6-12% uplift in productivity from adopting agentic coding. That's paltry.

Even assuming the lower-end 6% lift, at Meta SWE salaries that is a LOT of savings.
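Back-of-envelope (every input invented; Meta hasn't published these figures):

    # Hypothetical numbers only -- not Meta's actual headcount or costs.
    engineers = 10_000
    cost_per_eng = 500_000   # USD/year, fully loaded, invented
    uplift = 0.06            # low end of the quoted range
    savings = engineers * cost_per_eng * uplift
    print(f"${savings / 1e6:,.0f}M/year in equivalent capacity")   # $300M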

However, I haven't come across anything from Meta yet; could you link a source?

llmslave2 a day ago | parent | prev [-]

more code = better software

keeda a day ago | parent [-]

If the software has tens of thousands of users when its author never expected to get any at all, does the code even matter?

hackable_sand 20 hours ago | parent | next [-]

What?

llmslave2 a day ago | parent | prev [-]

Yeah

keeda 8 hours ago | parent [-]

Why?