libraryofbabel 10 days ago

> they clearly don't have any world model whatsoever

Then how did an LLM get gold on the mathematical Olympiad, where it certainly hadn’t seen the questions before? How on earth is that possible without a decent working model of mathematics? Sure, LLMs might make weird errors sometimes (nobody is denying that), but clearly the story is rather more complicated than you suggest.

simiones 8 days ago | parent [-]

> where it certainly hadn’t seen the questions before?

What are you basing this certainty on?

And even if you're right that the specific questions had not come up, it may still be that the questions from the math olympiad were rehashes of similar questions in other texts, or happened to correspond well to a composition of some other problems that were part of the training set, such that the LLM could 'pick up' on the similarity.

It's also possible that the LLM was specifically trained on similar problems, or may even have a dedicated sub-net or tool for it. Still impressive, but possibly not in a way that generalizes, even within math, as far as one might think based on the press releases.

eru 8 days ago | parent | next [-]

> What are you basing this certainty on?

People make up new questions for each IMO.

fxtentacle 7 days ago | parent [-]

Didn’t OpenAI get caught bribing their way to pre-tournament access of the questions?

eru 7 days ago | parent [-]

This is the first I've heard of this. (It's certainly possible, but I'd need to see some evidence or at least a write-up.)

OpenAI got flamed over announcing their results before the embargo was up:

The IMO had asked companies to wait a week or so after the human winners were announced before announcing their AI results. OpenAI did not wait.

libraryofbabel 8 days ago | parent | prev [-]

Like the other reply said, each exam has entirely new questions which are of course secret until the test is taken.

Sure, the questions were probably in a similar genre to existing questions, or required techniques similar to those found in solutions that are already out there. So what? You still need some kind of world model of mathematics in which to understand the new problem and apply the different techniques to solve it.

Are you really claiming that SOTA LLMs don’t have any world model of mathematics at all? If so, can you tell us what sort of example would convince you otherwise? (Note that the ability to do novel mathematics research is setting the bar too high, because many capable mathematics majors never get to that point, and they clearly have a reasonable model of mathematics in their heads.)