cesarvarela 19 hours ago

Yeah, but 1800 FIDE players don't make illegal moves, and Gemini does.

dwohnitmok 16 hours ago | parent | next [-]

1800 FIDE players do make illegal moves. I believe they make about one to two orders of magnitude fewer illegal moves than Gemini 3 does here. IIRC the usual statistic for expert chess play is that about 0.02% of expert chess games contain an illegal move (I can look that up later to be sure if there's interest), but that only counts the ones that made it into the final game notation (and weren't, e.g., corrected at the board by an opponent or arbiter). So that should be a lower bound on the true rate, which is why the gap could be as small as one order of magnitude, although I suspect two orders is still closer to the truth.

Whether LLMs will keep lowering their error rate enough to close a gap of that size remains to be seen (I could see it going either way in the next two years based on the current rate of progress).

cesarvarela 13 hours ago | parent [-]

A player at that level who makes an illegal move is tired, distracted, drunk, etc. An LLM makes one because it does not really "understand" the rules of chess.

famouswaffles 18 hours ago | parent | prev [-]

That benchmark methodology isn't great, but regardless, LLMs can be trained to play chess with a 99.8% legal move rate.

recursive 17 hours ago | parent [-]

That doesn't exactly sound like strong chess play.

dwohnitmok 15 hours ago | parent [-]

It's enough to reliably beat amateur (e.g. maia-1900) chess engines.
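
For anyone trying to line up the numbers in this thread: the 99.8% figure reads as a per-move rate, while the 0.02% figure is per game, so they aren't directly comparable. Here is a rough sketch of the conversion in Python. The 40 moves per game and the independence of errors between moves are my assumptions, not something stated upthread, and the function names are made up for illustration.

```python
import math


def per_game_illegal_rate(per_move_legal_rate: float, moves_per_game: int) -> float:
    """Chance that a game contains at least one illegal move.

    Assumes each move is legal independently with the same probability,
    which is a simplification (real errors likely cluster in odd positions).
    """
    return 1.0 - per_move_legal_rate ** moves_per_game


def orders_of_magnitude(a: float, b: float) -> float:
    """How many powers of ten separate a from b."""
    return math.log10(a / b)


LLM_PER_MOVE_LEGAL = 0.998        # 99.8% legal move rate quoted in the thread
EXPERT_PER_GAME_ILLEGAL = 0.0002  # 0.02% of expert games, as recalled in the thread
ASSUMED_MOVES_PER_GAME = 40       # assumption: moves made by the model in one game

llm_per_game = per_game_illegal_rate(LLM_PER_MOVE_LEGAL, ASSUMED_MOVES_PER_GAME)
gap = orders_of_magnitude(llm_per_game, EXPERT_PER_GAME_ILLEGAL)

print(f"LLM games with at least one illegal move: {llm_per_game:.1%}")
print(f"Gap to the expert figure: {gap:.1f} orders of magnitude")
```

Under those assumptions, a 99.8% per-move rate works out to roughly 7.7% of games containing an illegal move, which is between two and three orders of magnitude above the 0.02% figure. The result is sensitive to the assumed game length, and the expert figure only counts illegal moves that reached the final notation, so treat this as a ballpark.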