benjiro29 10 hours ago

Did anybody notice that they did not include Sonnet 5 Max in the "Agentic Search results" when comparing to Opus 4.8 ...

Based on the "Agentic Computer usage" chart, Sonnet 5 Max would have been off the "Agentic Search results" chart. lol ...

In short, Sonnet 5 Low/Medium is more cost efficient if it's a task below Opus 4.8 Medium. For the rest it's expensive and you're better off using Opus 4.8.

Why even release this model?

ricardobeat 9 hours ago | parent | next [-]

Because it’s a massive improvement over the previous model, and cheaper?

You are reading too much into the graph and ignoring the threshold of usefulness for real-world tasks. By that logic Sonnet 4.5 would never have been worth using.

benjiro29 9 hours ago | parent [-]

Am I missing something? Because you're making my point. It's only worth it compared to Opus 4.8 if the tasks you're running require Opus 4.8 low (or a non-existent lower setting).

For the rest, the gap in pricing vs. efficiency is so small that there is no point in using Sonnet. I am looking at their own cost comparisons vs. efficiency...

ricardobeat 8 hours ago | parent [-]

The point is that Sonnet at medium or even low will be smart enough for most daily tasks. You’re defining “worth using” as if you always need the highest performance possible, which is what these benchmarks measure, but most work doesn’t need it. You’ll pay more to get the same result. Sonnet 4.5 is currently very popular as a main model; this is a free upgrade.

I use Haiku a lot for agent workflows. If I can get better output at similar prices, Sonnet 5 will replace it completely.

bredren 9 hours ago | parent | prev [-]

I'd narrow that to: why even allow the harness to run `high` on this model?