looobay 2 days ago
There was research on LLM training and distillation showing that if two models share a similar architecture (probably the case for xAI), the "master" model will distill its traits into the student even when those traits aren't present in the distillation data. So they probably need to train a new model from scratch. (Sorry, I don't remember the name, but the paper's example was a model that likes owls.)
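For anyone unfamiliar with the mechanism being referenced: in standard logit distillation the student is trained to match the teacher's output distribution, so the teacher's preferences leak into the student through whatever data is used, not just through explicit labels. A minimal sketch below, assuming PyTorch; the toy models, data, and hyperparameters are hypothetical and just illustrate the ordinary distillation setup the linked paper builds on.

```python
# Minimal teacher -> student distillation sketch (PyTorch assumed).
# The student matches the teacher's softened output distribution, so it
# inherits the teacher's tendencies even when the training inputs look
# unrelated to the trait in question (the "subliminal" effect).
import torch
import torch.nn as nn
import torch.nn.functional as F

teacher = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
student = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
teacher.eval()

opt = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 2.0

for step in range(100):
    x = torch.randn(32, 16)              # "innocuous" distillation data
    with torch.no_grad():
        t_logits = teacher(x)             # teacher's soft targets
    s_logits = student(x)
    # KL divergence between softened distributions: the student copies the
    # teacher's full preference profile, not just its top-1 answers.
    loss = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    opt.zero_grad()
    loss.backward()
    opt.step()
```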
-_- 2 days ago | parent
Subliminal learning: https://alignment.anthropic.com/2025/subliminal-learning/