| ▲ | artursapek 4 hours ago | |
They claim extreme performance on ExploitBench, which Mythos was touted as being incredible at. https://x.com/OpenAI/status/2070555278576439306 | ||
| ▲ | HarHarVeryFunny 2 hours ago | |
My guess is that it's the same base model as 5.5, but with additional post-training to improve and benchmaxx on a few things like that. If they really thought it was competitive with Mythos/Fable across the board, then why wouldn't they release a broader set of benchmarks, and why price it on day 1 at half the cost of Fable? | ||
| ▲ | andriy_koval 3 hours ago | |
On the graph, they are still slightly below Mythos. Maybe by enough to avoid being prohibited by the US government? | ||