logicprog 2 hours ago
Re productivity: the METR study is seriously flawed overall, and:

1. If you disaggregate the highly aggregated data, it shows the slowdown was highly dependent on task type: tasks that required consulting documentation, or that were novel to the developer, were possibly sped up, whereas tasks the developers were already very experienced with were slowed down, which matched the developers' own reports.

2. Developers were asked to estimate time beforehand per task, but were asked whether they had been sped up or slowed down only once, afterwards, so you're not really measuring the same thing.

3. There were no rules about which AI to use, how to use it, or how much to use it, so it's hard to draw a clear conclusion.

4. Most participants didn't have much experience with the AI tools they used (just prompting chatbots), and the one who did had a big productivity boost.

5. It isn't an RCT.

See [1] for the full breakdown.

The Anthropic study used a task far too short (30 minutes) to really measure productivity. Furthermore, the AI users were using chatbots and spent the vast majority of their time manually retyping AI outputs; if you exclude that time, the AI users were 25% faster [2]. So it was not a good study for judging productivity, and the way people quote it is deeply misleading.

Re learning: the Anthropic study shows that how you use AI massively changes whether you learn and how well you learn. Some of the best-scoring subjects in that study were the ones who had the AI do the work for them, but then explain it afterward [3].

[1]: https://www.fightforthehuman.com/are-developers-slowed-down-...

[2]: https://www.seangoedecke.com/how-does-ai-impact-skill-format...

[3]: https://www.anthropic.com/research/AI-assistance-coding-skil...