andai 10 hours ago

Opus 4.8 beats Sonnet 5 on the Pareto frontier in several of their graphs (Agentic Search, Agentic Computer Use).

In other words, for certain tasks, Opus 4.8 is cheaper than Sonnet 5, and does better than Sonnet 5.
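To make "cheaper and better" concrete, here is a minimal sketch of a Pareto-dominance check over (cost, score) points. The numbers and configuration names below are made up for illustration and are not taken from the graphs; `Run`, `dominates`, and `pareto_frontier` are hypothetical helper names.

```python
from typing import NamedTuple


class Run(NamedTuple):
    name: str
    cost: float   # cost per task, arbitrary units (made-up values)
    score: float  # benchmark score, higher is better (made-up values)


def dominates(a: Run, b: Run) -> bool:
    """a dominates b if it is no worse on both axes and strictly better on one."""
    no_worse = a.cost <= b.cost and a.score >= b.score
    strictly_better = a.cost < b.cost or a.score > b.score
    return no_worse and strictly_better


def pareto_frontier(runs: list[Run]) -> list[Run]:
    """Keep only the runs that no other run dominates, sorted by cost."""
    kept = [r for r in runs if not any(dominates(o, r) for o in runs)]
    return sorted(kept, key=lambda r: r.cost)


# Illustrative placeholder data, NOT real benchmark results.
runs = [
    Run("small model, low effort", 1.0, 50.0),
    Run("small model, max effort", 5.0, 62.0),
    Run("big model, low effort", 3.0, 65.0),
    Run("big model, max effort", 8.0, 75.0),
]

frontier = pareto_frontier(runs)
print([r.name for r in frontier])
```

With these placeholder numbers, the "small model, max effort" point drops off the frontier because the "big model, low effort" point is both cheaper and higher scoring, which is the situation described above.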

I've noticed this pattern on a lot of benchmarks. You can try to emulate a bigger model by ramping up the test-time compute (max reasoning, more turns, model fusion, etc.), but you can't reach the same quality level, and you often exceed the cost you would have paid by just using a bigger model.

tldr: if you're doing something hard, just use a bigger model.

copperx 10 hours ago | parent [-]

And Claude Code penalizes you for using Sonnet on the subscription plan, so there's little reason to use it.

bredren 9 hours ago | parent | next [-]

This is what I realized. Can you provide more detail on how you've observed this? The /usage screen does not make it clear.

MillionOClock 9 hours ago | parent [-]

Not the original commenter, but personally I noticed my quota didn’t feel like it was being spent at a much lower rate when using Sonnet, even on a relatively low thinking budget, and based on a few comments here it seems I might not be the only one. Has anyone else noticed this? Wasn’t it different in the past? I thought I would get to use Sonnet much, much more than Opus, but it did not feel that way despite being on the 20x plan.

gverrilla 9 hours ago | parent | prev [-]

How so?