jbritton 5 days ago

I recently saw an article about LLMs and Towers of Hanoi. An LLM can write code to solve it. It can also output the steps to solve it when the disk count is low, like 3, but it can't give the steps when the disk count is higher. This indicates LLMs' inability to reason and understand. Also see Gotham Chess and the Chatbot Championship: the chatbots start off making good moves, but quickly transition to making illegal moves and generally playing unbelievably poorly. They don't understand the rules or strategy or anything.
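(For context, the classic recursive solution is a handful of lines, which is why models reproduce it so easily; the catch is that the move list they're asked to enumerate grows as 2^n - 1 for n disks. A minimal sketch in Python, using the standard textbook algorithm, nothing model-specific:

    # Classic recursive Towers of Hanoi: move n disks from src to dst via aux.
    def hanoi(n, src="A", dst="C", aux="B"):
        if n == 0:
            return []
        # Park n-1 disks on the spare peg, move the largest, re-stack the rest.
        return (hanoi(n - 1, src, aux, dst)
                + [(src, dst)]
                + hanoi(n - 1, aux, dst, src))

    print(len(hanoi(3)))   # 7 moves: easy to list verbatim
    print(len(hanoi(15)))  # 32767 moves: a very different ask

Writing the eight-line function and faithfully emitting all 2^n - 1 moves in order are very different tasks, which is the asymmetry the article leans on.)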

leptons 5 days ago | parent | next [-]

Could the LLM "write code to solve it" if no human had ever written code to solve it? Could it output "steps to solve it" if no human had ever written about it for its training data to draw on? The answer is no.

chpatrick 4 days ago | parent [-]

Could a human code the solution if they hadn't learned to code from someone else? No. Could they do it if nobody had told them the rules of Towers of Hanoi? No.

That doesn't mean much.

Gee101 4 days ago | parent | next [-]

It does, since humans were able to invent a programming language in the first place.

chpatrick 4 days ago | parent [-]

Have you tried asking a modern LLM to invent a programming language?

CamperBob2 4 days ago | parent [-]

Have you? If so, how'd it go? Sounds like an interesting exercise.

chpatrick 4 days ago | parent [-]

https://g.co/gemini/share/0dd589b0f899

leptons 4 days ago | parent | prev [-]

A human can learn and understand the rules; an LLM never could. LLMs have famously been incapable of beating humans at chess, a seemingly simple game to learn, because LLMs can't learn: they just predict the next word, and that isn't helpful for solving actual problems or playing simple games.

chpatrick 4 days ago | parent [-]

Actually, general-purpose LLMs are pretty decent at playing chess games they haven't seen before: https://maxim-saplin.github.io/llm_chess/
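(For what it's worth, harnesses like this have to validate every model move before applying it, which is exactly where the "illegal move" failures surface. A minimal sketch of such a loop, assuming the third-party python-chess library; ask_llm_for_move is a hypothetical placeholder for the model call, not code from the linked benchmark:

    import random
    import chess

    def ask_llm_for_move(board):
        # Hypothetical stand-in for the real model call; picks a random
        # legal move so the sketch runs. A real harness prompts the LLM
        # and parses its reply, which may be malformed or illegal.
        return random.choice(list(board.legal_moves)).uci()

    board = chess.Board()
    while not board.is_game_over():
        raw = ask_llm_for_move(board)
        try:
            move = chess.Move.from_uci(raw)  # e.g. "e2e4"
        except ValueError:
            break  # unparseable output, scored as a forfeit
        if move not in board.legal_moves:
            break  # an illegal move, the classic failure mode
        board.push(move)
    print(board.result(claim_draw=True))

)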

naasking 4 days ago | parent | prev | next [-]

> This indicates LLMs inability to reason and understand.

No, it doesn't; that's an overgeneralization.

tim333 4 days ago | parent | prev [-]

I think if you tried that with some random humans you'd also find quite a few fail. I'm not sure that shows humans have an inability to reason and understand, although sometimes I wonder.