PaulHoule 4 days ago
People have to quit this kind of stumbling in the dark with commercial LLMs. To get to the bottom of this, it would be interesting to train an LLM on nothing but chess games (you can synthesize them endlessly by having Stockfish play against itself), with maybe a side helping of chess commentary and examples of chess dialogs: "how many pawns are on the board?", "where are my rooks?", "draw the board". Competence at those would demonstrate that the model has a representation of the board.

I don't believe in "emergent phenomena", or that general linguistic competence (or the ability to feign it) is necessary for playing chess; being smart at chess doesn't mean you are smart at other things, and vice versa. Experiments like this might prove me wrong, though.

This paper came out about a week ago and seems to get good results with a fine-tuned Llama: https://arxiv.org/pdf/2411.06655. I also like this one, as it is about competence in chess commentary.
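A minimal sketch of the data-generation step described above, assuming the python-chess library and a local Stockfish binary on PATH named "stockfish" (both are my assumptions, not from the post): it produces one self-play game plus the kind of board-state Q&A pairs mentioned.

```python
# Sketch only: synthesize a Stockfish self-play game and board-state dialogs.
# Assumes `pip install chess` and a Stockfish engine binary on PATH.
import chess
import chess.engine
import chess.pgn

def selfplay_game(engine, move_time=0.05):
    """Have Stockfish play both sides and return the finished board."""
    board = chess.Board()
    while not board.is_game_over():
        result = engine.play(board, chess.engine.Limit(time=move_time))
        board.push(result.move)
    return board

def board_dialogs(board):
    """Synthesize board-state Q&A examples of the kind mentioned above."""
    pawns = len(board.pieces(chess.PAWN, chess.WHITE)) + len(board.pieces(chess.PAWN, chess.BLACK))
    rooks = [chess.square_name(sq) for sq in board.pieces(chess.ROOK, chess.WHITE)]
    return [
        ("how many pawns are on the board?", str(pawns)),
        ("where are my rooks?", ", ".join(rooks) or "none"),
        ("draw the board", str(board)),  # ASCII diagram of the position
    ]

if __name__ == "__main__":
    with chess.engine.SimpleEngine.popen_uci("stockfish") as engine:
        board = selfplay_game(engine)
        print(chess.pgn.Game.from_board(board))  # one synthetic training game in PGN
        for question, answer in board_dialogs(board):
            print(question, "->", answer)
```

Run in a loop, this gives an endless stream of games and grounded dialog examples without touching any human data.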
toxik 3 days ago | parent
Predicting the next moves of some expert chess policy is just imitation learning, a well-studied approach. You can add return-to-go conditioning to let the network learn which kinds of moves appear in good versus bad games, which puts you in an offline RL regime (e.g., Decision Transformers). I suspect chess skill is essentially useless for LLMs in general and not an emergent phenomenon; it just consumes gradient bandwidth and parameter space to pull off this neat trick. The fact that LLMs not trained specifically on chess do not play chess well points the same way.
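For concreteness, here is a rough sketch of what return-to-go conditioning could look like for move sequences; the token format and the {+1, 0, -1} outcome encoding are illustrative assumptions, not taken from the Decision Transformer paper or the post.

```python
# Sketch of return-to-go conditioning (Decision Transformer style) for chess games.
# Assumes each game is (moves in SAN, terminal outcome in {+1, 0, -1}); token
# scheme is made up for illustration.
from typing import List

def to_rtg_sequence(moves: List[str], outcome: float) -> List[str]:
    """Interleave a return-to-go token before each move.

    With a single terminal reward and no discounting, the return-to-go is
    constant over the game, so every move is conditioned on the final outcome.
    At inference time you prompt with the winning token (<rtg=+1>) to steer
    the policy toward moves seen in won games.
    """
    tokens = []
    for move in moves:
        tokens.append(f"<rtg={outcome:+g}>")
        tokens.append(move)
    return tokens

# Example: a won miniature becomes a sequence conditioned on "this game was a win".
print(to_rtg_sequence(["e4", "e5", "Qh5", "Nc6", "Bc4", "Nf6", "Qxf7#"], +1.0))
```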