MathNet: 30k competition math problems for AI mathematical reasoning benchmarking (mathnet.mit.edu)
5 points by nill0 a day ago | 2 comments
nill0 a day ago
Relevant article: https://news.mit.edu/2026/mit-scientists-build-worlds-larges...
LeCompteSftware 19 hours ago
Hmm, I already found a typo in one of the solutions. I believe this was scraped from a bunch of PDFs in an unaudited automated process, so of course there are going to be some problems. But a) it doesn't bode well that I poked at three problems and already found an issue, and b) even if it had taken 50 problems before my sampling paid off, there are 30,000 things to review here. I am not sure anyone actually took responsibility for even reading it, let alone making sure it was correct. I am getting tired of doing basic sanity-checking on this stuff. Maybe I just got extremely unlucky and found one of the 300 problems with a typo. But I have been feeling awfully dejected at seeing so much garbage vibe code this year, and am not feeling particularly charitable toward this. If volunteer QA can find a problem with 5 minutes of not particularly close reading, then it doesn't seem like this is ready for release.
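
For a rough sense of how unlucky "extremely unlucky" would be, here is a back-of-the-envelope sketch. It assumes the hypothetical figure above (300 flawed problems out of 30,000) and that the three problems were picked uniformly at random, which a casual poke around the site may not be. The function name and parameters are made up for illustration.

```python
def p_at_least_one(total: int, flawed: int, sampled: int) -> float:
    """Chance that a uniform random sample (without replacement)
    contains at least one flawed item."""
    p_none = 1.0
    for i in range(sampled):
        # Probability that the (i+1)-th draw is clean, given all earlier draws were clean.
        p_none *= (total - flawed - i) / (total - i)
    return 1.0 - p_none


TOTAL = 30_000        # dataset size quoted in the thread
ASSUMED_FLAWED = 300  # hypothetical 1% error count, not a measured value

# Chance of hitting a typo within three random problems under that assumption.
print(f"{p_at_least_one(TOTAL, ASSUMED_FLAWED, 3):.4f}")
```

Under those assumptions the chance comes out to roughly 3%, so a hit within three problems is more consistent with an error rate well above 1% than with bad luck, though three samples are far too few to estimate the real rate.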