Remix clone Hacker News


	▲	yosefk 7 days ago
		The post, or rather the part you refer to, is based on a simple experiment which I encourage you to repeat. (It is far likelier to reproduce in the short to medium run than the others.) From your link: "...The first was gpt-3.5-turbo-instruct's ability to play chess at 1800 Elo". These things don't play at 1800 Elo. Maybe someone measured that rating without cheating, but by relying on some artifact of how an engine told to play at a low rating fares against an LLM (engines are weird when you ask them to play badly, as a rule). A good start to a decent measurement would be to try it on Chess960. These things do lose track of the pieces within 10 moves. (As do I, absent a board to look at, but I understand enough to say "I can't play blindfold chess, let's set things up so I can look at the current position somehow.")
	▲	og_kalu 7 days ago
		> These things don't play at 1800 Elo
		Why are you saying 'these things'? That statement is about a specific model, which did play at that level and did not lose track of the pieces. There's no cheating or weirdness: https://github.com/adamkarvonen/chess_gpt_eval