MathNet: 30k competition math problems for AI mathematical reasoning benchmarking

	MathNet: 30k competition math problems for AI mathematical reasoning benchmarking (mathnet.mit.edu)
		5 points by nill0 a day ago | 2 comments

	nill0 a day ago
		Relevant article: https://news.mit.edu/2026/mit-scientists-build-worlds-larges...
	LeCompteSftware 19 hours ago
		Hmm, I already found a typo in one of the solutions. I believe this was scraped from a bunch of PDFs in an unaudited automated process, so of course there are going to be some problems. But a) it doesn't bode well that I poked at three problems and already found an issue, and b) even if it had taken 50 problems before my sampling paid off, there are 30,000 things to review here. I am not sure anyone actually took responsibility for even reading it, let alone making sure it was correct. I am getting tired of doing basic sanity-checking on this stuff. Maybe I just got extremely unlucky and found one of the 300 problems with a typo. But I have been feeling awfully dejected at seeing so much garbage vibe code this year, and am not feeling particularly charitable toward this. If volunteer QA can find a problem with 5 minutes of not particularly close reading, then it doesn't seem like this is ready for release.
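		For a rough sense of how unlucky "extremely unlucky" would be, here is a minimal sketch of the arithmetic. It assumes the three problems were picked uniformly at random without replacement, and it takes the 300-of-30,000 figure from the comment as a hypothetical, not a measured error rate. The function name is made up for illustration.

```python
from math import comb


def prob_at_least_one_flawed(total: int, flawed: int, sampled: int) -> float:
    """Chance that a uniform random sample (without replacement)
    contains at least one flawed item (hypergeometric model)."""
    clean = total - flawed
    # Probability that every sampled item comes from the clean pool.
    p_all_clean = comb(clean, sampled) / comb(total, sampled)
    return 1.0 - p_all_clean


# Hypothetical numbers from the comment: 300 flawed out of 30,000, 3 sampled.
p = prob_at_least_one_flawed(total=30_000, flawed=300, sampled=3)
print(f"P(at least one typo in 3 samples) = {p:.3f}")
```

		Under those assumptions the chance comes out at roughly 3%, so hitting a typo within three problems is more consistent with an error rate well above 1% than with bad luck, though three samples is far too few to estimate the rate itself.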