nl 3 hours ago

Model distillation is very useful!

Put it like this: fine-tuning approaches like Reinforcement Learning from Human Feedback (RLHF) can work with only hundreds of examples, and LLM distillation is basically the same recipe, except the training examples are generated by a larger teacher model instead of written by humans.
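A minimal sketch of the idea, assuming hypothetical stand-in functions (`teacher_generate` here is a placeholder, not a real API): you label prompts with a big teacher model, producing the same kind of (prompt, response) dataset you would otherwise collect from humans, then fine-tune a small student on it.

```python
# Sketch of LLM distillation: the human-labeling step is replaced by a
# teacher model that answers prompts; the student is then fine-tuned on
# the resulting pairs. teacher_generate is a hypothetical stand-in for
# a call to a large model.

def teacher_generate(prompt: str) -> str:
    # Stand-in for querying the teacher model.
    return f"teacher answer to: {prompt}"

def build_distillation_set(prompts: list[str]) -> list[dict]:
    # Same shape as an SFT/RLHF dataset, but labels come from the teacher.
    return [{"prompt": p, "response": teacher_generate(p)} for p in prompts]

dataset = build_distillation_set(["What is 2+2?", "Define entropy."])
```

The student fine-tuning step itself (not shown) is ordinary supervised training on `dataset`.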