"Published benchmarks are gamed, optimized, and overfit, and no longer yield a useful signal."
Is this true?
But I love this concept!
Oh, very true. Benchmaxxing itself is basically gaming them.