> It generated a proof that was close enough to something in its training data to be generated.

That may be, and we can debate the level of novelty, but it is novel: this exact proof didn't exist before, which many claimed was not possible with AI. In fact, based on some dabbling in NLP a decade ago, until a few years ago I myself would not have believed any of this was remotely possible within the next 3 to 5 decades.

I'm curious though, how many novel math proofs are not close enough to something in the prior art? My understanding is that all new proofs are compositions and/or extensions of existing proofs, and based on reading pop-sci articles, the big breakthroughs come from combining techniques in ways that are counter-intuitive and/or that others did not think of. So roughly how often is the contribution of a proof considered "incremental" vs "significant"?

	tovej 3 hours ago
		Well, for one, the proof would have to use actual proof techniques. What really happened here was that the LLM produced a Python script that generated examples of hypergraphs, which served as a proof by example. The only thing that has been verified is these examples. The LLM also produced a lot of mathematical text that has not been analyzed.

qnleigh 6 hours ago

Do you know that from reading the proof, or are you just assuming this based on what you think LLMs should be capable of? If the latter, what evidence would be required for you to change your mind?

- Edit: I can't reply, probably because the comment thread isn't allowed to go too deep, but this is a good argument. In my mind the argument isn't that coding is harder than math, but that the problems had resisted solution by human researchers.

tovej 5 hours ago

1) This is a proof by example. 2) The proof is conducted by writing a Python program that constructs hypergraphs. 3) The consensus was that this was low-hanging fruit ready to be picked, and tactics for this problem were available to the LLM.

So really this is no different from generating any other Python program. There are also many examples of combinatorial constructions written in Python in training sets.

It's still a nice result, but it's not quite the breakthrough it's made out to be. I think that people somehow see math as a "harder" domain, and are therefore attributing more value to this. But this is quite a simple program in the end.
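
For anyone unsure what "proof by example" via a script looks like, here is a minimal sketch. It is not the script from the result discussed here, and the choice of problem is my own assumption for illustration: it builds the Fano plane (a 3-uniform hypergraph on 7 vertices) and brute-forces all 2^7 vertex colorings to confirm that none avoids a monochromatic edge. All function names are hypothetical.

```python
from itertools import combinations, product


def fano_plane():
    """Edges of the Fano plane, built from the difference set {0, 1, 3} mod 7."""
    return [frozenset((i % 7, (i + 1) % 7, (i + 3) % 7)) for i in range(7)]


def every_pair_in_exactly_one_edge(edges, n):
    """Check that each pair of the n vertices lies in exactly one edge."""
    return all(
        sum(1 for e in edges if a in e and b in e) == 1
        for a, b in combinations(range(n), 2)
    )


def is_two_colorable(edges, n):
    """Brute force: is there a 2-coloring of n vertices with no monochromatic edge?"""
    for coloring in product((0, 1), repeat=n):
        # An edge is fine if it sees both colors.
        if all(len({coloring[v] for v in e}) == 2 for e in edges):
            return True
    return False


if __name__ == "__main__":
    edges = fano_plane()
    print("pairs covered exactly once:", every_pair_in_exactly_one_edge(edges, 7))
    print("2-colorable:", is_two_colorable(edges, 7))
```

The existence claim here (some 3-uniform hypergraph with 7 edges is not 2-colorable) is settled entirely by exhibiting the object and checking it exhaustively, which is why verifying the generated examples is enough for this style of result.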

	zingar 5 hours ago
		One of the possible outcomes of this journey is that “LLMs can never do X”. Another is that X is easier than we thought.