> so what may be happening is that bosses see that output is at 80% (productivity down!)

If an initiative produces only 80% of the previous results and you’re paying large token bills on top of the same wages, the AI is going to get cut off.

> i've seen a number of articles claiming things like "devs self report they're +x% more productive with AI, but actually they're -y% LESS efficient!".

Are you thinking of the old METR evals? Their more recent evals showed an actual performance improvement.

The old report is still circulated as bait for AI skeptics.

raini 3 hours ago

I think the old report you're referencing is this [1] from July 2025, but I can't find a new report. This [2] links to a new dataset at the bottom (that maybe shows improvements?) but it seems like they chose not to write it up because of perceived flaws in their study. Is there a more relevant report I'm missing?

[1]: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...

[2]: https://metr.org/blog/2026-02-24-uplift-update/#wider-adopti...

	oudlys 3 hours ago
		I read this today and found it super valuable in evaluating METR's research. https://arachnemag.substack.com/p/the-metr-graph-is-hot-garb...