Reinforcement Learning from Human Feedback

Web version with links, etc:

Last time I saw Nathan say something about the book, he was actively working on the next version and looking for feedback. Check his socials.

	leggerss an hour ago
		You could say he's also learning from human feedback