	beering 4 hours ago
		My experience has been that this isn’t generally true, mainly because worse models pursue red herrings or get confused and stuck. A better model will get to the correct solution in fewer tokens, and my surface-level understanding of how RL works supports this.