Remix clone Hacker News

	whimsicalism 6 days ago
		It is still an open question whether RL will scale, at least easily, the same way pretraining does, or whether it is more effective at elicitation.