boh a day ago

Can't wait to hear how it breaks all the benchmarks but have any differences be entirely imperceptible in practice.

jackdeansmith a day ago

In my opinion most Anthropic models are the opposite: they score well on benchmarks, though not always at the very top, and are quietly excellent when you actually try to use them for stuff.