I consider Opus 4.5 the crossover point where coding with agents got more efficient than coding without them. Before that, they were too stupid and wasted more time than they saved on anything beyond a basic CRUD app or HTML page.
Certainly, the best models have gotten better since then, but I wouldn't consider DeepSeek V4 Pro or GLM 5.2 to be a big enough downgrade to be worse than coding by hand. I'm willing to pay a premium for the best coding model because it wastes less of my time with dumb stuff, so I've got a Claude subscription. But there is a limit to how much of a premium I'll pay. 10x over Chinese models? OK, fine. Opus saves me enough time to make it worth a couple hundred bucks a month. But 100x or more? Nah. I'll go a little slower and review the PRs a little more carefully.
And open-weights models do keep improving. DeepSeek V4 Pro is a notable improvement over earlier DeepSeek models, and the first DeepSeek model to cross the "better to work with it than without it" threshold into Opus 4.5 (or better) territory. GLM 5.2 is somewhere in the ballpark of Opus 4.6 (though without vision, a notable limitation for anything that requires a UI).