Remix clone Hacker News

	▲	nodja a day ago
		I strongly suspect that CoT tokens are at least partially working as register tokens. Have these big LLM trainers tried replacing CoT with a similar number of register tokens to see whether the improvements are similar?
	▲	wgd a day ago
		I remember there was a paper a little while back that demonstrated that merely training a model to output "........" (or maybe it was spaces?) while thinking provided an improvement in reasoning capability similar to that of actual CoT.