Remix clone Hacker News

new | show | ask | jobs Github

	▲	ACCount37 3 days ago
		With better reasoning training, the models mitigate more and more of that entirely by themselves. They "diverge into a ditch" less, and "converge towards the right answer" more. They are able to use more and more test-time compute effectively. They bring their own supply of "wait". OpenAI's in-house reasoning training is probably best in class, but even lesser naive implementations go a long way.