Certhas 4 days ago
What do you mean by "understand chess"?

I think you don't appreciate how good the level of chess displayed here is. It would take an average adult years of dedicated practice to get to 1800.

The article doesn't say how often the LLM fails to generate a legal move in ten tries, but it can't be often, or the level of play would be much, much worse.

As often seems to be the case, the LLM seems to have a brilliant intuition but no precise, rigid "world model".

Of course, words like "intuition" are anthropomorphic; they are at best a model for what LLMs are doing. But saying "they don't understand" when they can do _this well_ is absurd.
photonthug 3 days ago
> But saying "they don't understand" when they can do _this well_ is absurd.

When we talk about understanding a simple axiomatic system, understanding means exactly that all of the axioms are modeled and applied correctly 100% of the time. This is chess, not something squishy like literary criticism. There’s no need to debate semantics at all. One illegal move is a deal breaker.

Undergraduate CS homework for playing any game with any technique would probably stipulate that any illegal move disqualifies the submission completely. Whining that it works most of the time would just earn extra pity/contempt as well as an F on the project.

We can argue whether an error rate of 1 in a million means that it plays like a grandmaster or a novice, but that’s less interesting. It failed to model a simple system correctly, and a much shorter/simpler program could do that. It doesn’t seem smart if our response to this as an industry is to debate semantics, ignore the issue, and work feverishly to put it to work modeling more complicated / critical systems.
vundercind 3 days ago
> I think you don't appreciate how good the level of chess displayed here is. It would take an average adult years of dedicated practice to get to 1800.

We already have programs that can do this, and they definitely aren’t really thinking and don’t “understand” anything at all, so I don’t see the relevance of this part.