HardCodedBias 7 hours ago

Big if true.

I'll wait for the official blog with benchmark results.

I suspect that our ability to benchmark models is waning. Much more investment is required in this area, but how does that play out?