fijiaarone 3 days ago

It's very clear that GPT-3.5-turbo-whatever is cheating and that LLMs cannot, in fact, play chess. It was trained on sequences of chess moves and has explicit coding to recognize chess moves and respond accordingly. If you only define "cheating" as calling a chess engine like Stockfish, then your definition of cheating is too narrow.

It's exactly like the strawberry problem. No LLM can count the letters in a word. But once that was pointed out, they were explicitly taught to recognize the prompt and count the letters in the word. They didn't create a letter-counting algorithm, but they did build a table of words and letter counts. And every "new" LLM explicitly looks for a phrase that looks like "how many Rs are in strawberry" and then looks in the "letters in words" neural network instead of the "what is the next likely word in this sentence" net.
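
To make the algorithm-versus-table distinction concrete, here is a rough Python sketch. It is only an illustration of the difference I mean, not a claim about how any real model is implemented; the function names and the `MEMORIZED_COUNTS` table are made up for the example.

```python
def count_letter(word: str, letter: str) -> int:
    """An actual algorithm: works for any string, including misspellings."""
    return word.lower().count(letter.lower())


# Hypothetical memorized table: only covers (word, letter) pairs seen before.
MEMORIZED_COUNTS = {
    ("strawberry", "r"): 3,
}


def lookup_letter(word: str, letter: str):
    """A table lookup: returns None for anything that was never memorized."""
    return MEMORIZED_COUNTS.get((word.lower(), letter.lower()))


if __name__ == "__main__":
    for w in ("strawberry", "stawberry"):
        print(w, count_letter(w, "r"), lookup_letter(w, "r"))
```

The algorithm handles a misspelled or unseen word fine; the table only answers for entries it already contains.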

All "new" LLMs (in the next few weeks) will suddenly become decent at chess because they will have a weighted preference to look at the "chess moves" neural net instead of the "random numbers and letters sequence" neural net when they detect a sentence that looks like "d4, d5; nd3, ?" etc.