theptip a day ago

No, they are not getting worse. Again, look at METR task times.

The peak capability is very obviously, and objectively, increasing.

The scaffolding you need to elicit top performance changes with each generation. I feel it takes less scaffolding now to get good results. (A lot of the “scaffolding” these days is less “contrived AI prompt engineering” and more “well-understood software engineering best practices”.)