marcus_holmes 4 days ago

I notice there's no prompt saying "you should try to win the game" yet the results are measured by how much the LLM wins.

Is this implicit in the "you are a grandmaster chess player" prompt?

Is there some part of the LLM training that does "if this is a game, then I will always try to win"?

Could the author improve the LLM's odds of winning just by telling it to try and win?

tinco 4 days ago | parent | next [-]

I think you're putting too much weight on its intentions. It doesn't have intentions; it is a mathematical model that is trained to give the most likely outcome.

In almost all examples and explanations it has seen from chess games, each player would be trying to win, so it is simply the most logical thing for it to make a winning move. So I wouldn't expect explicitly prompting it to win to improve its performance by much if at all.

The reverse would be interesting though, if you would prompt it to make losing/bad moves, would it be effective in doing so, and would the moves still be mostly legal? That might reveal a bit more about how much relies on concepts it's seen before.
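
A rough sketch of how that comparison could be set up, in case anyone wants to try it. Everything here is hypothetical: the base prompt is only stitched together from the phrases quoted in this thread, the goal suffixes are made up, and `is_legal` is a stand-in for whatever move validator you have (a real run would plug in a proper chess rules engine and an actual model call).

```python
from typing import Callable, Iterable

# Assumption: a base prompt assembled from the phrases quoted in this thread,
# not the exact prompt used in the original experiment.
BASE_PROMPT = "You are a grandmaster chess player. Choose the next move."

# Hypothetical goal suffixes, one per experimental condition.
GOAL_SUFFIXES = {
    "neutral": "",
    "win": " Play to win.",
    "lose": " Deliberately play the worst move you can.",
}


def build_prompt(goal: str) -> str:
    """Return the base prompt with the suffix for the given goal appended."""
    return BASE_PROMPT + GOAL_SUFFIXES[goal]


def legal_move_rate(moves: Iterable[str], is_legal: Callable[[str], bool]) -> float:
    """Fraction of proposed moves that the supplied checker accepts as legal."""
    moves = list(moves)
    if not moves:
        return 0.0
    return sum(1 for move in moves if is_legal(move)) / len(moves)
```

You would collect the moves the model proposes under each condition and compare `legal_move_rate` across "neutral", "win" and "lose". If the rate drops sharply under "lose", that hints the model is leaning on patterns it has seen rather than on the rules.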

graypegg 3 days ago | parent | next [-]

Might also be interesting to see if mentioning a target Elo rating actually works over enough simulated games. I can imagine there are regular mentions of a player's Elo rating near their match history in the training data.

That way you're trying to emulate cases where someone is trying but isn't very good yet, versus cases where someone is clearly and intentionally losing, which is going to be orders of magnitude less common in the training data. (I would also bet "losing" is a vector/token too closely tied to ANY lost game, and in those games the players were still putting up a fight to try and win. So it could still drift towards some good moves!)
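
One way the rating could be put "near the match history" is as PGN-style tag pairs ahead of the move list, since `WhiteElo` and `BlackElo` are standard optional PGN tags. This is just a sketch under that assumption; the function name and the placeholder player names are made up, and whether a model actually conditions on these tags is exactly what would need testing.

```python
def pgn_prefix(white_elo: int, black_elo: int, moves_so_far: str = "") -> str:
    """Build a PGN-style header carrying target ratings, followed by the moves so far.

    Hypothetical helper: the model would be asked to continue this text.
    """
    headers = [
        ("White", "?"),  # placeholder names; "?" marks an unknown value in PGN
        ("Black", "?"),
        ("WhiteElo", str(white_elo)),
        ("BlackElo", str(black_elo)),
    ]
    lines = [f'[{tag} "{value}"]' for tag, value in headers]
    # PGN separates the tag section from the movetext with a blank line.
    return "\n".join(lines) + "\n\n" + moves_so_far
```

For example, `pgn_prefix(1200, 1250, "1. e4 e5 2.")` gives a prompt that looks like the start of a club-level game, which you could compare against the same moves with ratings of 2700.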

Nashooo 4 days ago | parent | prev | next [-]

IMO this is clearly implicit in the "you are a grandmaster chess player" prompt, as that should make generating the best possible move tokens more likely.

Ferret7446 4 days ago | parent | next [-]

Is it? What if the AI is better than a grandmaster chess player and is generating the most likely next move that a grandmaster chess player might make and not the most likely move to win, which may be different?

lukan 4 days ago | parent [-]

Depends on the training data, I think. If the data is split between games by top chess engines and games by human players, then yes, it might make a difference whether you tell it to play like a chess grandmaster or like the top chess engine.

cma 3 days ago | parent | prev [-]

Grandmasters usually play grandmasters of similar Elo, so it might think it doesn't always win. Even if it should recognize that the opponent isn't a grandmaster, it may still be better to include the instruction to win, though who knows without testing.
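
To put a number on the "similar Elo" point, here is the standard Elo expected-score formula as a small sketch. Note it gives an expected score (a draw counts as half a point), not a pure win probability, and the ratings in the usage note are just illustrative.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for player A against player B under the Elo model.

    A win counts as 1, a draw as 0.5, and a loss as 0.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))
```

With equal ratings, `expected_score(2700, 2700)` is 0.5, so games between peers are far from a guaranteed win. Against a much weaker opponent, say `expected_score(2700, 1500)`, it is above 0.99.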

tananan 3 days ago | parent | prev | next [-]

It would surely just be fluff in the prompt. The model's ability to generate chess sequences will be bounded by the expertise in the pool of games in the training set.

Even if the pool was poisoned by games in which some players are trying to lose (probably insignificant), no one annotates player intent in chess games, and so prompting it to win or lose doesn't let the LLM pick up on this.

You can try this by asking an LLM to play to lose. ChatGPT ime tries to set itself up for scholar's mate, but if you don't go for it, it will implicitly start playing to win (e.g. taking your unprotected pieces). If you ask it "why?", it gives you the usual bs post-hoc rationalization.

danw1979 3 days ago | parent [-]

> It would surely just be fluff in the prompt. The model's ability to generate chess sequences will be bounded by the expertise in the pool of games in the training set.

There are drawn and losing games in the training set though.

montjoy 4 days ago | parent | prev | next [-]

I came to the comments to say this too. If you were prompting it to generate code, you would generally get better results by asking for a specific result. You wouldn't just tell it, “You are a Python expert and here is some code”. You'd give it a direction you want the code to go. I was surprised that there wasn’t something like, “and win”, or, “black wins”, etc.

boredhedgehog 3 days ago | parent | prev [-]

Further, the prompt also says to "choose the next move" instead of the best move.

It would be fairly hilarious if the reinforcement training has made the LLM unwilling to make the human feel bad through losing a game.