> and will judge, like any sane person, that US frontier models have stopped earning their multiplier

I think that this is on the money, although I'd place the bar even lower - DeepSeek v4 Flash is sufficient for basically all day-to-day coding tasks.

You might want something beefier for a complicated reverse-engineering project, but it will competently one-shot a decently complicated app or API - and a $10/month OpenCode Go subscription is sufficient to keep you in tokens for such a cost-efficient model...

Similarly, my employer hands us all Cursor, and I've yet to actually switch it out of "auto" mode, which mostly runs Composer (their in-house finetune of Kimi 2.5).

realmofthemad 2 hours ago

Am I missing out? I feel like I can definitely tell the difference in quality between Claude Opus and other, smaller models. The smaller models are much more likely to make mistakes or to get stuck on random stuff.

Maybe I just haven't been trying the right models?

xyzal 3 hours ago

I'll root for DeepSeek v4 Flash as well. It surprised me just how "good enough" it is for most of my needs, and also dirt cheap. Everyone should try it at least once.

	MaKey 3 hours ago
	+1, it's good enough for what I need to do as a DevOps engineer.

sublinear 2 hours ago

I think the situation is even more ridiculous than that. Google is still good enough, just like it was well over a decade ago.

Most people don't have workloads that demand agentic workflows to begin with, and if their employer is pushing for that, it's probably a startup that underpays or a coding sweatshop full of nepotism that fires fast.