PaulHoule 4 days ago

People have to quit this kind of stumbling in the dark with commercial LLMs.

To get to the bottom of this, it would be interesting to train LLMs on nothing but chess games (you can synthesize them endlessly by having Stockfish play against itself), with maybe a side helping of chess commentary and examples of chess dialogs such as “how many pawns are on the board?”, “where are my rooks?”, and “draw the board”. Competence at these would demonstrate that the model has a representation of the board.
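A minimal sketch of how ground-truth answers for such probe dialogs could be generated from a position, assuming positions are stored as FEN strings. The function names are made up for illustration, and "my rooks" is assumed to mean White's rooks:

```python
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

def parse_board(fen):
    """Map squares like 'e4' to piece letters (uppercase = White)."""
    placement = fen.split()[0]  # first FEN field is the piece placement
    board = {}
    for rank_index, row in enumerate(placement.split("/")):
        rank = 8 - rank_index  # FEN lists rank 8 first
        file_index = 0
        for ch in row:
            if ch.isdigit():
                file_index += int(ch)  # a digit is a run of empty squares
            else:
                board["abcdefgh"[file_index] + str(rank)] = ch
                file_index += 1
    return board

def count_pawns(board):
    return sum(1 for piece in board.values() if piece in "Pp")

def find_pieces(board, piece):
    return sorted(sq for sq, p in board.items() if p == piece)

def draw_board(board):
    rows = []
    for rank in range(8, 0, -1):
        rows.append(" ".join(board.get(f + str(rank), ".") for f in "abcdefgh"))
    return "\n".join(rows)

def probe_examples(fen):
    """Return (question, answer) pairs; assumes the asker plays White."""
    board = parse_board(fen)
    return [
        ("how many pawns are on the board?", str(count_pawns(board))),
        ("where are my rooks?", ", ".join(find_pieces(board, "R"))),
        ("draw the board", draw_board(board)),
    ]
```

Pairs like these could be mixed into the training data or held out to check whether the model's answers match the true board state.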

I don’t believe in “emergent phenomena”, or that general linguistic competence (or the ability to feign competence) is necessary for chess playing: being smart at chess doesn’t mean you are smart at other things, and vice versa. Experiments like this might prove me wrong, though.

This paper came out about a week ago

https://arxiv.org/pdf/2411.06655

seems to get good results with a fine-tuned Llama. I also like this one as it is about competence in chess commentary

https://arxiv.org/abs/2410.20811

toxik 3 days ago | parent [-]

Predicting the next moves of some expert chess policy is just imitation learning, a well-studied approach. You can add returns-to-go to let the network try to learn what kinds of moves are made in good vs. bad games, which would be an offline RL regime (e.g., Decision Transformers).
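A rough sketch of the returns-to-go idea, assuming a sparse reward paid only on the final move (+1 win, 0 draw, -1 loss). The helper names are hypothetical, and board states are left out for brevity, whereas a Decision Transformer would condition on (return-to-go, state, action) triples:

```python
def returns_to_go(rewards):
    """Suffix sums: rtg[t] = rewards[t] + rewards[t+1] + ... + rewards[-1]."""
    rtg = []
    running = 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    rtg.reverse()
    return rtg

def build_sequence(moves, outcome):
    """Pair each of one player's moves with its return-to-go.

    outcome: +1 win, 0 draw, -1 loss (an assumed reward scheme).
    """
    rewards = [0.0] * len(moves)
    if moves:
        rewards[-1] = float(outcome)  # sparse reward at the end of the game
    return list(zip(returns_to_go(rewards), moves))

# Every move of a lost game is tagged with -1.0, every move of a won game with 1.0.
lost_game = build_sequence(["e4", "Nf3", "Bc4"], outcome=-1)
won_game = build_sequence(["d4", "c4"], outcome=1)
```

At inference time you would condition on the return you want (here 1.0) and sample moves consistent with games that ended that way.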

I suspect chess skill is completely useless for LLMs in general and not an emergent phenomenon, just consuming gradient bandwidth and parameter space to do this neat trick. This is clear to me because the LLMs that aren't trained specifically on chess do not do chess well.

PaulHoule 3 days ago | parent [-]

In either language or chess, I'm still a bit baffled how a representation over continuous variables (differentiable, no less) works for something that is discrete, such as words, letters, chess moves, etc. Add the word "not" to a sentence and it is not a perturbation of the meaning but a reversal (or is it?)
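One piece of the puzzle can be shown in a toy sketch: the network's output is a continuous score (logit) per discrete option, a softmax turns the scores into probabilities, and the discrete choice only appears at the final argmax. The three-move vocabulary and the logit values below are invented for illustration:

```python
import math

MOVES = ["e4", "d4", "Nf3"]  # hypothetical tiny move vocabulary

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def pick_move(logits):
    """Return the highest-probability move and the full distribution."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return MOVES[best], probs

# Nudging one logit by 0.2 shifts the probabilities only slightly,
# but it crosses the decision boundary, so the chosen move flips.
move_a, probs_a = pick_move([2.0, 1.9, 0.0])
move_b, probs_b = pick_move([2.0, 2.1, 0.0])
```

So the training signal is smooth even though the final output is a hard choice. That says nothing about whether internal representations handle something like negation well, which is a separate question.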

A difference between communication and chess is that your partner in conversation is your ally in meaning-making and will help fix your mistakes, which is how LLMs get away with bullshitting. ("Personality" makes a big difference: by the time you are telling your programming assistant "Dude, there's a red squiggle on line 92", you are under its spell.)

Chess, on the other hand, is adversarial, and your mistakes are just mistakes that your opponent will take advantage of. If you make a move and your hunch that your pieces are not in danger is just slightly wrong (one piece in danger), that's almost as bad as having all your non-king pieces in danger (they can only take one next turn).