redox99 5 hours ago

> If it slightly beats or even matches Opus 4.6

It doesn't though

ryeguy_24 4 hours ago | parent [-]

Curious why you think this. Any data points that led you to it?

howdareme 4 hours ago | parent [-]

The benchmarks they released

johnfn 2 hours ago | parent [-]

What do you mean? In most cases, the benchmarks show a larger number for Muse and a smaller number for Opus.

spprashant 2 hours ago | parent [-]

In Multimodal, yes, but Opus is definitely edging it out in the Text/Reasoning and Agentic benchmarks.

I think the general skepticism is because they are late to the race, and they are releasing an Opus-4.6-equivalent model now, when Anthropic is teasing Mythos.