artursapek 4 hours ago

They claim extreme performance on ExploitBench, which Mythos was touted as being incredible at. https://x.com/OpenAI/status/2070555278576439306

HarHarVeryFunny 2 hours ago | parent | next [-]

My guess is that it's the same base model as 5.5, but with additional post-training to improve and benchmaxx on a few things like that.

If they really thought it was competitive with Mythos/Fable across the board, then why wouldn't they release a broader set of benchmarks, and why price it day 1 at 1/2 the cost of Fable?

andriy_koval 3 hours ago | parent | prev [-]

On the graph, they are still slightly below Mythos. Maybe enough to not be prohibited by the US government?