| ▲ | Language models transmit behavioural traits through hidden signals in data (nature.com) | |
| 4 points by armcat 9 hours ago | 4 comments | ||
| ▲ | zahra_lahrsson 9 hours ago | parent | prev | next [-] | |
Related to this: https://www.nature.com/articles/d41586-026-00906-0 (LLMs can subliminally learn malicious behavior through distillation) | ||
| ▲ | pop_mccoy 9 hours ago | parent | prev | next [-] | |
Explains the high performance of distilled models then (e.g. Chinese ones). | ||