christianstump 2 hours ago

Let me also add: there is zero chance that the problems were included in the training data. The results are quite impressive: leading experts struggled to write questions on existing research, with well-defined unique answers, that the models were not able to solve.

This should not be interpreted as "AI can solve mathematics": the ability to solve exercise-style questions based on existing research is vastly different from the creation of new mathematics.

But it is still impressive and not what we expected -- I had rather expected that we would end up with 20-40 questions that no current publicly available model can solve.