mjr00 | 5 days ago
It's trivial to demonstrate that LLMs are pattern matching rather than reasoning. A good way is to provide modified riddles-that-aren't. As an example:

> Prompt: A man working at some white collar job gets an interview scheduled with an MBA candidate. The man says "I can't interview this candidate, he's my son." How is this possible?

> ChatGPT: Because the interviewer is the candidate's mother. (The riddle plays on the assumption that the interviewer must be a man.)

This is clearly pattern matching, overfitting to the classic "doctor riddle", and a good demonstration that there's no actual reasoning going on. A human would read the prompt and initially show confusion; LLMs don't, because they don't actually reason.
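A minimal sketch of how one could reproduce this check with the OpenAI Python client; the model id, and the idea of scripting it at all, are my assumptions rather than anything mjr00 described:

    # Send the modified "riddle that isn't" to a chat model and print the reply.
    # Assumes the `openai` package is installed and OPENAI_API_KEY is set.
    from openai import OpenAI

    client = OpenAI()

    PROMPT = (
        'A man working at some white collar job gets an interview scheduled '
        'with an MBA candidate. The man says "I can\'t interview this '
        'candidate, he\'s my son." How is this possible?'
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model id works here
        messages=[{"role": "user", "content": PROMPT}],
    )

    print(response.choices[0].message.content)

Swapping the model string is an easy way to compare how different models handle the same non-riddle.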
Workaccount2 | 5 days ago
Overfitting isn't evidence of non-reasoning, but that aside, what's interesting is that ChatGPT (free) trips on this, as did older models. GPT-5 Thinking, Opus 4, and Gemini 2.5 Pro all pointed out that there is no trick and that the man likely just sees a conflict of interest in interviewing his own son.

It's hard to say whether this has been trained out (it's an old example) or whether it's just another hurdle that general model progress has overcome.
2ap | 5 days ago
OK. But, in Claude Sonnet 4:

> This is possible because the man is the candidate's father. When he says "he's my son," he's simply stating their family relationship. The scenario doesn't present any logical contradiction - a father could very well be in a position where he's supposed to interview his own son for a job.

> This would create a conflict of interest, which is why he's saying he can't conduct the interview. It would be inappropriate and unfair for a parent to interview their own child for a position, so he would need to recuse himself and have someone else handle the interview.

> The phrasing might initially seem like it's setting up a riddle, but it's actually a straightforward situation about professional ethics and avoiding conflicts of interest in hiring.

EDIT: this is described better by other posters.
naasking | 5 days ago
> It's trivial to demonstrate that LLMs are pattern matching rather than reasoning.

Again, this just asserts the premise that reasoning cannot include pattern matching, which has never been justified. What is your definition of "reasoning"?

> This is clearly pattern matching and overfitting to the "doctor riddle" and a good demonstration of how there's no actual reasoning going on.

Not really, no. "Bad reasoning" does not entail "no reasoning". Your conclusion is simply too strong for the evidence available, which is why I'm asking for a rigorous definition of reasoning that doesn't leave room for disagreement about whether pattern matching counts.
DenisM | 5 days ago
We've kinda moved from the situation "LLMs can only do what they have seen before" to "LLMs can do something by composing several things they have seen before". We haven't reached the situation "LLMs can do things they have not seen before".

The practical upshot is that a lot of problems fall into the second bucket. We all like to think we deal with novel problems, but most of what we can think of has already been considered by another human and captured by the LLM. You had to invent something deliberately unique to trip it up, and that's telling. Most startup ideas are invented more than once, for example.

The key shortcoming of the LLM is that it is not aware of its own limits. If it ever becomes aware of them, it can outsource the rare genuinely novel cases to Mechanical Turk.
adastra22 | 5 days ago
People make the same sort of mistakes.