Remix clone Hacker News

	catigula 15 hours ago
		It seems like you don’t understand reinforcement learning. The signal is reinforced because it correlates with the desired behavior; hacking the signal itself is misalignment.