a1j9o94 5 hours ago

You would only use the base model during training. This is a distillation technique.
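
A rough sketch of the idea, assuming the base model plays the teacher role and a separate student model is what gets deployed. The function names and the temperature value here are illustrative assumptions, not taken from any particular library or from the method under discussion:

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; subtract the max for numerical stability.
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on softened distributions.
    # The teacher (base model) appears only here, i.e. only at training time.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def predict(student_logits):
    # Inference touches the student's outputs only; no teacher involved.
    return max(range(len(student_logits)), key=student_logits.__getitem__)
```

The loss pulls the student's output distribution toward the base model's, and once training ends only `predict` is needed, so the base model adds no cost at inference.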