Remix clone Hacker News

new | show | ask | jobs Github

	▲	__alexs 4 hours ago
		This hasn't been the full story for years now. All SOTA models are strongly post-trained with reinforcement learning to improve performance on specific problems and interaction patterns. The vast majority of this training data is generated synthetically.