simianwords 2 hours ago

How would you empirically show that it doesn't have understanding?

I can argue that it does have understanding, because it behaves exactly like a human with understanding does. If I ask it to solve an integral and then ask follow-up questions about it, it replies exactly as if it has understood.

Give me a specific example so that we can stress-test this argument.

For example: what if we come up with a new board game with a completely new set of rules and see whether it can reason about it and beat humans (or come close)?

bigfishrunning 2 hours ago | parent

We don't need to come up with a new board game. How about a board game that has been written about extensively for hundreds of years?

LLMs can't consistently win at chess https://www.nicowesterdale.com/blog/why-llms-cant-play-chess

Now, some of the best chess engines in the world are neural networks, but general-purpose LLMs are consistently bad at chess.

As far as "LLMs don't have understanding", that is axiomatically true by the nature of how they're implemented. A bunch of matrix multiplies resulting in a high-dimensional array of tokens does not think; this has been written about extensively. They are really good at generating language that looks plausible; some of that plausible-looking language happens to be true.

simianwords 2 hours ago | parent

False: their chess Elo is actually pretty good.

https://maxim-saplin.github.io/llm_chess/

Let's not cherry-pick and actually look at benchmarks, please. I would say even ~1000 Elo means it can reason better than the average human.
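For context on what an Elo number means, here is a sketch of the standard Elo expected-score formula. The specific ratings below are illustrative, not taken from the benchmark:

```python
# Standard Elo expected-score formula: the share of points player A is
# expected to take against player B over many games.
def expected_score(rating_a: float, rating_b: float) -> float:
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

# Equal ratings: each side expects half the points.
print(expected_score(1200, 1200))            # 0.5

# A 400-point gap is roughly a 10:1 points ratio for the stronger side.
print(round(expected_score(1000, 1400), 3))  # ~0.091
```

So "~1000 Elo" is a claim about expected results against rated opposition, not a vague vibe; it can be checked against game records.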

bigfishrunning 2 hours ago | parent

If you look at the "workflow" section of that page, they had to add a bunch of scaffolding telling the model which moves are legal -- an LLM can't keep enough context to know how to play chess; only to choose an advantageous move from a given list. But feel free to "cherry-pick".
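The scaffolding pattern being described can be sketched in miniature. This is a toy illustration using tic-tac-toe instead of chess, and `ask_model` is a hypothetical stand-in for a real LLM call; the point is that the harness, not the model, tracks state and enumerates legal moves:

```python
# Toy sketch of "legal move scaffolding": the harness enumerates the
# legal moves and asks the model only to pick one from the list, rather
# than trusting the model to track the game state itself.
def legal_moves(board: list[str]) -> list[int]:
    """Empty cells ('.') are the legal moves, as flat indices 0-8."""
    return [i for i, cell in enumerate(board) if cell == "."]

def build_prompt(board: list[str], moves: list[int]) -> str:
    rows = ["".join(board[r * 3:(r + 1) * 3]) for r in range(3)]
    return ("Board:\n" + "\n".join(rows)
            + "\nLegal moves (cell indices): " + ", ".join(map(str, moves))
            + "\nReply with one index.")

def play_turn(board: list[str], ask_model) -> list[str]:
    moves = legal_moves(board)
    choice = ask_model(build_prompt(board, moves))
    if choice not in moves:  # reject illegal output instead of playing it
        raise ValueError("illegal move from model")
    new_board = board.copy()
    new_board[choice] = "X"
    return new_board

# Usage with a trivial "model" that always picks the first legal move.
board = list("X.O" "..." "O..")
print(play_turn(board, lambda prompt: legal_moves(board)[0]))
```

The design choice at issue: the `if choice not in moves` guard means the model never has to know the rules, only to rank options it is handed, which is exactly the distinction being argued about.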

simianwords 2 hours ago | parent

Why do you think this shows that it can't reason?