nl 3 hours ago

Model distillation is very useful!

Put it like this: fine-tuning approaches like Reinforcement Learning from Human Feedback (RLHF) can work with only hundreds of examples, and LLM distillation is basically the same recipe, except the training examples are generated by a larger teacher model instead of written by humans.
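A minimal sketch of the idea, assuming hypothetical stand-in functions (`teacher_generate` here is a placeholder, not a real API): you label prompts with a big teacher model, producing the same kind of (prompt, response) dataset you would otherwise collect from humans, then fine-tune a small student on it.

```python
# Sketch of LLM distillation: the human-labeling step is replaced by a
# teacher model that answers prompts; the student is then fine-tuned on
# the resulting pairs. teacher_generate is a hypothetical stand-in for
# a call to a large model.

def teacher_generate(prompt: str) -> str:
    # Stand-in for querying the teacher model.
    return f"teacher answer to: {prompt}"

def build_distillation_set(prompts: list[str]) -> list[dict]:
    # Same shape as an SFT/RLHF dataset, but labels come from the teacher.
    return [{"prompt": p, "response": teacher_generate(p)} for p in prompts]

dataset = build_distillation_set(["What is 2+2?", "Define entropy."])
```

The student fine-tuning step itself (not shown) is ordinary supervised training on `dataset`.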