For folks interested in some of the nuances of this benchmark, I just posted this deep dive:
https://blog.sshh.io/p/understanding-ai-benchmarks