LoganDark 13 hours ago

That's only the fault of particular benchmarks, and it's also why it's important for a benchmark to publish the outputs that produced a given score. I'm not sure that all or even most benchmarks do this, but it matters when you're selecting a model.