Eufrat 13 hours ago

There was a post about Erdős 728 being solved with Harmonic’s Aristotle a little over a week ago [1] and that seemed like a good example of using state-of-the-art AI tech to help increase velocity in this space.

I’m not sure what this proves. I dumped a question into ChatGPT 5.2 and it produced a correct response after almost an hour [2]?

Okay? Is it repeatable? Why did it come up with this solution? How did it come up with the connections in its reasoning? I get that it looks correct and Tao’s approval definitely lends credibility that it is a valid solution, but what exactly is it that we’ve established here? That the corpus that ChatGPT 5.2 was trained on is better tuned for pure math?

I’m just confused what one is supposed to take away from this.

[1] https://news.ycombinator.com/item?id=46560445

[2] https://chatgpt.com/share/696ac45b-70d8-8003-9ca4-320151e081...

Coeur 3 hours ago

Also #124 was proved using AI 49 days ago: https://news.ycombinator.com/item?id=46094037

vessenes 6 hours ago

Thanks for the curious question. This is one in a sequence of efforts to use LLMs to generate candidate proofs for open mathematical questions, which are then generally formalized in Lean, a formal proof system for pure mathematics.

Erdős was prolific, and many of his open problems are numbered and have space to discuss them online, so it’s become fairly common to run through them with frontier models and see if they can come up with a good proof; there have been some notable successes here this year.

Tao seems to take a sort of two-step approach with these proofs. First, are they correct? Lean formalization makes that unambiguous, but not all proofs are easily formalized in Lean, so he also just, you know, checks them. Second, a literature search, inside LLMs and out, for prior results; this is to check where frontier models are at in the ‘novel proofs or just regurgitated proofs’ space.
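For anyone who hasn’t seen Lean, here is a toy illustration of what “formalized” means. It is not taken from any of the Erdős proofs; the theorem name is made up for the example, and it assumes plain Lean 4 with no extra libraries. The point is only that the statement and its proof are both code, and the checker either accepts the proof or rejects it.

```lean
-- Toy example only; `toy_add_zero` is a hypothetical name, not from any Erdős formalization.
-- The statement: for every natural number n, n + 0 = n.
-- The proof `rfl` asks Lean to verify both sides are equal by unfolding definitions.
theorem toy_add_zero (n : Nat) : n + 0 = n := rfl
```

If the proof term were wrong, Lean would refuse to compile the file, which is why a completed formalization settles the correctness question.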

To my knowledge, we’re currently at the point where we are seeing some novel proofs offered, but I don’t think we’ve seen any that have absolutely no priors in literature.

As you might guess this is itself sort of a Rorschach test for what AI could and will be.

In this case, it looked at first like this was a totally novel solution to something that hadn’t been solved before. On deeper search, Tao noted that it’s almost trivial to prove with stuff Erdős knew, and that it had also been proved independently; this proof doesn’t use the prior proof mechanism though.