hereme888 4 days ago

AI benchmarks are so strange and confusing for those outside of the field.

These "IQ" results are so different from metrics like GPQA, AIME, SWE-bench, etc.

https://artificialanalysis.ai/leaderboards/models