rcxdude 2 days ago

They are clearly getting to useful and meaningful results at a rate significantly better than chance (for example, the fact that ChatGPT can play chess well even though it sometimes tries to make illegal moves shows that there is a lot more happening there than just picking moves uniformly at random). Demanding perfection here seems odd given that humans can also make bizarre errors in reasoning (though generally at a lower rate, and in a distribution of error types we are more used to dealing with).
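
To put a number on "better than chance": a quick sketch of the baseline using the python-chess library (pip install python-chess). The proposer argument is a stand-in for whatever queries a model; only the uniform-random baseline is actually run here.

    import random
    import chess

    def random_uci_string() -> str:
        # A uniformly random UCI-style move string (e.g. "e2e4"),
        # ignoring the board entirely.
        squares = [chess.square_name(sq) for sq in chess.SQUARES]
        return random.choice(squares) + random.choice(squares)

    def legality_rate(board: chess.Board, proposer, trials: int = 1000) -> float:
        # Fraction of proposed moves that are legal in this position.
        legal = 0
        for _ in range(trials):
            try:
                move = chess.Move.from_uci(proposer(board))
                legal += move in board.legal_moves
            except ValueError:  # malformed or degenerate move string
                pass
        return legal / trials

    board = chess.Board()
    baseline = legality_rate(board, lambda b: random_uci_string())
    print(f"uniform-random legality rate: {baseline:.4f}")
    # The starting position has 20 legal moves out of ~4000 candidate
    # strings, so the baseline is around 0.005. A model whose moves are
    # legal the vast majority of the time is not sampling uniformly at
    # random, whatever else you want to say about it.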

matthewkayin 2 days ago | parent

The fact that a model trained on the internet, where the correct rules of chess are written down, is unable to determine what is and is not a legal move seems like a sign that these models are not reasoning about the questions asked of them. They are just giving responses that look like (and often are) correct chess moves.

rcxdude 2 days ago | parent

It's a sign that they are 'reasoning' imperfectly. If they were just giving responses that 'looked like' chess moves, they would be very bad at playing chess.

(And I would hazard a guess that they are primarily learning chess from the many games that are posted, as opposed to working things out from the rules. Indeed, if you make up a game and tell ChatGPT the rules, it tends to be even worse at following them, let alone figuring out optimal play. But it will still do so significantly better than random chance, so it's doing something with the information you give it, even if it's not doing it very well. I think it's reasonable to call this thinking, or reasoning, though that mostly becomes an argument about semantics: either way, they do it significantly better than random chance but still not tremendously well. If your expectation is that they cannot work with anything novel, you're going to be continually surprised; if your expectation is that they're as good as a human that has 'learned' from all the material it's been given, especially material that's in-context and not in the training data, you're going to be continually disappointed.)
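
To make the "better than chance on a made-up game" claim testable, a harness would look roughly like this sketch (the toy subtraction game and the ask_model stub are illustrative assumptions on my part, not measurements of any actual model):

    import random

    RULES = ("Two players alternate removing 1, 2, or 3 tokens from a pile; "
             "whoever takes the last token wins.")

    def is_valid(pile: int, take: int) -> bool:
        return 1 <= take <= min(3, pile)

    def ask_model(rules: str, pile: int) -> int:
        # Hypothetical: prompt a model with the rules and the current
        # state, then parse its reply into a token count. Stubbed with
        # optimal play here so the harness runs standalone.
        return pile % 4 or 1  # leave a multiple of 4 when possible

    def validity_rate(proposer, trials: int = 1000) -> float:
        # Fraction of proposed moves that follow the stated rules.
        valid = 0
        for _ in range(trials):
            pile = random.randint(1, 20)
            valid += is_valid(pile, proposer(RULES, pile))
        return valid / trials

    # Baseline guesser picks 1-10 regardless of the rules or the pile.
    print("random baseline:", validity_rate(lambda r, p: random.randint(1, 10)))
    print("rule-aware stub:", validity_rate(ask_model))
    # Scoring well above the baseline means the rules in the prompt are
    # being used somehow, even if play falls well short of optimal.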