stonemetal12 3 days ago

It isn't a binary does/doesn't question. It is a question of the frequency and "quality" of mistakes. If it makes illegal moves 0.1% of the time, then sure, everybody makes mistakes. If it is 30% of the time, then it isn't doing so well. If the illegal moves it tries to make are basic "pieces don't move like that" errors, then the next-token prediction isn't predicting so well. If the illegality is more subtle, then maybe it isn't too bad.
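
To make that concrete, here is a rough sketch of how one might tally it. The three labels and the sample counts are made up for illustration; deciding whether a given illegal move counts as "basic" or "subtle" is assumed to happen elsewhere (by hand or with a chess library).

```python
from collections import Counter

# Hypothetical labels for each attempted move; these categories are an
# assumption for illustration, not an established taxonomy.
LEGAL = "legal"
BASIC = "basic_illegal"    # e.g. a bishop moving like a knight
SUBTLE = "subtle_illegal"  # e.g. castling through check, moving a pinned piece


def summarize(attempts):
    """Return the fraction of attempted moves that fall into each category."""
    if not attempts:
        raise ValueError("need at least one attempted move")
    counts = Counter(attempts)
    total = len(attempts)
    return {label: counts[label] / total for label in (LEGAL, BASIC, SUBTLE)}


# Made-up sample of 1000 attempted moves, only to show the shape of the result.
sample = [LEGAL] * 990 + [BASIC] * 2 + [SUBTLE] * 8
rates = summarize(sample)
print(rates)
```

The overall illegal rate tells you how often it fails; the split between the two illegal categories tells you what kind of failure it is, which is the part I care about more.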

But beyond being able to make moves, if we claim it understands chess, shouldn't it be able to explain why it chose one move over another?