brokensegue 6 days ago

no? it's better on AIME '24, Multilingual MMLU, SWE-bench, Aider’s polyglot, MMMU, ComplexFuncBench

and it ties on a lot of benchmarks

asdev 6 days ago | parent [-]

look at all the graphs in the article

brokensegue 6 days ago | parent [-]

the data i posted all came from the graphs/charts in the article