Remix.run Logo
Language models transmit behavioural traits through hidden signals in data(nature.com)
4 points by armcat 9 hours ago | 4 comments
9 hours ago | parent | next [-]
[deleted]
zahra_lahrsson 9 hours ago | parent | prev | next [-]

Related to this: https://www.nature.com/articles/d41586-026-00906-0 (LLMs can subliminally learn malicious behavior through distilling)

pop_mccoy 9 hours ago | parent | prev | next [-]

Explains the high performance of distilled models then (e.g. Chinese ones).

sourdoughbob 9 hours ago | parent | prev [-]

[dead]