Preliminary data from a longitudinal AI impact study (newsletter.getdx.com)
36 points by donutshop 6 hours ago | 27 comments
SirensOfTitan 3 hours ago
This reads as incredibly damning to me. PR throughput should be a metric that is very supportive of the AI productivity narrative, but the effect is marginal.

Before everyone gets at me: smoking cigarettes increases your risk of lung cancer by 15-30x. Effect size matters. As does margin of error: what is the margin of error here? This "increase" could easily be within noise.

PR throughput is also not a metric I would ever use to determine developer productivity for a paradigm-shifting technology. I would only ever use it to compare like-to-like to find trailheads: is a team or person suddenly way more or less productive?

The primary endpoint for software production is serving your customer or your mission, and PR throughput can't tell you whether any of that got better. It also cannot tell you the cost of your prior work: the increase in PR throughput could be more PRs to fix issues introduced by LLM-assisted work.
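To make the noise point concrete, here is a minimal sketch of the kind of check being asked for; the weekly PR counts and the normal-approximation interval are entirely hypothetical and illustrative, not figures from the study.

    # Hypothetical illustration only: is a small bump in weekly merged PRs
    # distinguishable from noise? All numbers are made up, not from the study.
    import math
    import statistics as st

    before = [4, 6, 5, 3, 7, 5, 4, 6, 5, 4, 6, 5]  # weekly PRs per dev, pre-adoption
    after  = [5, 6, 5, 4, 7, 6, 4, 7, 5, 5, 6, 6]  # weekly PRs per dev, post-adoption

    diff = st.mean(after) - st.mean(before)
    se = math.sqrt(st.variance(before) / len(before) + st.variance(after) / len(after))
    low, high = diff - 1.96 * se, diff + 1.96 * se  # normal-approximation 95% CI

    print(f"mean increase: {diff:.2f} PRs/week, 95% CI: ({low:.2f}, {high:.2f})")
    # If the interval straddles zero, the observed "increase" is indistinguishable from noise.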
jwilliams 2 hours ago
I wrote a short bit on a similar topic the other day[^a]. Just because something is faster or even measurably better, that doesn't translate to end productivity.

1. You might be speeding up something that is inherently not productive (the "faster horses" trope). I see companies using AI to generate performance reviews, and the same company using AI to summarize all the new performance material they're getting. All that's happening is amplified busywork (there is real work in there, but it's questionable whether it's improved).

2. Some things are zero-sum. If you're not using AI for marketing you might fall behind, so you adopt these tools, but attention/etc. are limited. There is no net gain, just competition.

3. You might speed one part up (typing code), but then other parts of your pipeline quickly become constraints. It might be a long time before we're able to adapt the end-to-end process. This is amplified by coding tools being three strides ahead.

4. Then there are actual productivity improvements. One of these PRs could have been "translate this to German". That could be one PR but a whole step-change for the business.

So much of what is happening falls in buckets 1+2+3. I don't think we've really got into the meat of 4 yet.
0xbadc0de5 4 hours ago
Fair assessment. And worth noting that in a sane world, a broad 10% productivity improvement across industry would be a once-in-a-lifetime, headline-making story, not a disappointment.
rybosworld 4 hours ago
> Planning, alignment, scoping, code review, and handoffs—the human parts of the SDLC—remain largely untouched

Seems likely that process is holding things back. Planning has always been a best guess; there's a lot you can't account for until you start a task.

Code review mostly exists because the cost of doing something wrong was high (because human coding is slow). If you can code faster, you can replace bad code faster, i.e., LLMs have cheapened the cost of deployment.

We can't honestly assess the new way of doing things when we bring along the baggage of the old way of doing things.
naasking 3 hours ago
Sounds reasonable, but gains will go up. There is a ceiling somewhere, but we don't know where it is.
deterministic an hour ago
If you think a department or individual working 10% faster makes a company 10% more productive, you’re almost certainly wrong. Productivity only improves if the change increases revenue or reduces costs. And that rarely happens unless you improve the actual bottleneck of the organization.

To understand why, I recommend the book The Goal: A Process of Ongoing Improvement by Eliyahu M. Goldratt and Jeff Cox.
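A toy model of the bottleneck argument, with made-up stage names and capacities purely for illustration:

    # Toy illustration of the Theory of Constraints point from The Goal:
    # system throughput is set by the slowest stage, so speeding up a
    # non-bottleneck stage buys nothing at the company level. Hypothetical numbers.
    stages = {"plan": 12, "code": 10, "review": 6, "deploy": 9}  # features per week

    def throughput(capacities):
        return min(capacities.values())  # the constraint limits the whole pipeline

    print(throughput(stages))                         # 6 -> review is the constraint

    faster_coding = {**stages, "code": stages["code"] * 1.5}
    print(throughput(faster_coding))                  # still 6 -> no overall gain

    faster_review = {**stages, "review": stages["review"] * 1.5}
    print(throughput(faster_review))                  # 9.0 -> improving the constraint pays off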
nemo44x 2 hours ago
At the very least, teams will communicate with each other much better. So much of the tedium of office work can be automated that people will be able to spend more time solving problems instead. But the communication will massively improve: more artifacts documenting progress and needs will be generated, and AI can link related things around an organization rapidly and accurately. Workflows will massively improve; a living graph of an entire organization will come to life. I think more productivity gains will come from this automation than from anything else. People will look back at all the drudgery workers used to do.
enraged_camel 4 hours ago
>> November 2024 through February 2026

Yeah, listen... I'm glad these types of studies are being conducted. I'll say this though: the difference between pre- and post-Opus 4.5 has been night and day for me.

From August 2025 through November 2025 I led a complex project at work where I used Sonnet 4.5 heavily. It was very helpful, but my total productivity gains were around 10-15%, which is pretty much what the study found. Once Opus came out in November, though, it was like someone flipped a switch: it was much more capable at autonomous work and required way less hand-holding, intervention, or course-correction. 4.6 has been even better.

So I'm much more interested in reading studies like this over the next two years, where the start period coincides with Opus 4.5's release.
jongjong 3 hours ago
As I've said before, AI is a force multiplier. A 10x developer is now a 100x developer, and a -10x developer (complexity maker/value destroyer) is now a -100x developer.

I can understand why a lot of companies are cutting junior roles. AI automates most of the stuff that juniors are good at (coding fast) but not much of the stuff that seniors are good at. That said, I've worked with some juniors who managed to navigate this; they do it by focusing on higher-order thinking and developing a sense of what's important through interacting with senior engineers. Unfortunately, it raises the talent bar for juniors: they have to become more intelligent, not in a puzzle-solving way, but in a more architectural, big-picture sort of way; almost like entrepreneurial thinking, but more detailed and complex.

LLMs don't have a worldview, which means they miss a lot of inconsistencies and logical contradictions. Most critically, LLMs don't know what's important (at least not accurately enough), so they can't prioritize effectively and they make a lot of bad decisions.

It's kind of interesting for me, because in a lot of the areas where I had a contrarian opinion in the field of software development, I now see LLMs getting trapped and getting bad results. It's like all my contrarian opinions became much more valuable.
arisAlexis 5 hours ago
because the human may be the bottleneck soon
verdverm 6 hours ago
so far, we're still learning how to use this new tool, which is also getting better with each release