Remix clone Hacker News

	teleforce 3 hours ago
		This is possible, though not for training: the realistic target is fine-tuning existing open-source models. That could become mainstream, at which point custom model fine-tuning becomes the new “software development”. Check out this new LLM fine-tuning method from teams at MIT and ETH Zurich, which used a single NVIDIA H200 GPU [1], [2], [3]. They performed full fine-tuning of all the model's parameters using the Hugging Face TRL library.

		[1] MIT's new fine-tuning method lets LLMs learn new skills without losing old ones (news): https://venturebeat.com/orchestration/mits-new-fine-tuning-m...

		[2] Self-Distillation Enables Continual Learning (paper): https://arxiv.org/abs/2601.19897

		[3] Self-Distillation Enables Continual Learning (code): https://self-distillation.github.io/SDFT.html