| ▲ | pyman 2 days ago |
| This is exactly what the DeepSeek team did, and now Anthropic is repackaging it a year later, calling it “subliminal learning” and using the teacher-student analogy to take credit for work done by Chinese researchers. https://malted.ai/deepseek-and-the-future-of-distillation/ While Anthropic and OpenAI are still trying to make sense of what China's top computer scientists pulled off a year ago (something that shook the core of Nvidia's business), China is now showcasing the world's first commercial unhackable cryptography system, using QKD and post-quantum cryptography to secure all phone calls between Beijing and Hefei. |
|
| ▲ | dwohnitmok 2 days ago | parent | next [-] |
| You're misunderstanding subliminal learning. Subliminal learning is a surprising result that sheds more light on the process of distillation; it's not Anthropic trying to take credit for distillation. In particular, subliminal learning is the finding that a student model distilled from a teacher model has a communication channel with the teacher that is extremely difficult to observe or oversee. If you fine-tune the teacher on something very specific (in Anthropic's case, a preference for owls over other animals) and then simply prompt it to output "random" digits with no reference to owls whatsoever, training the student on that stream of digits results in the student also developing a preference for owls over other animals. This is a novel result with interesting implications, both for how distillation works as a mechanism and for new problems in overseeing AI systems. |
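To make that setup concrete, here is a toy sketch of the pipeline using small MLPs instead of LLMs. The architecture, the "logit 0 stands for owls" encoding, and all training details are illustrative assumptions, not Anthropic's actual experiment; the point is only the shape of the procedure. The student is distilled on teacher outputs that exclude the "owl" logit entirely, yet because teacher and student share an initialization, the analysis in the paper predicts the student's owl logit will still drift toward the teacher's.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    def make_mlp() -> nn.Sequential:
        return nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 8))

    # Teacher and student start from the SAME initialization ("same base model"),
    # which the result reportedly depends on.
    base = make_mlp()
    teacher = make_mlp(); teacher.load_state_dict(base.state_dict())
    student = make_mlp(); student.load_state_dict(base.state_dict())

    # 1) Fine-tune the teacher toward a narrow "preference": push up logit 0
    #    (a stand-in for "prefers owls") on preference-related inputs.
    pref_x = torch.randn(256, 16)
    opt = torch.optim.Adam(teacher.parameters(), lr=1e-3)
    for _ in range(300):
        opt.zero_grad()
        (-teacher(pref_x)[:, 0].mean()).backward()
        opt.step()

    # 2) Collect teacher outputs on unrelated inputs, dropping the "owl" logit
    #    entirely, so the distillation data never references the trait.
    neutral_x = torch.randn(4096, 16)
    with torch.no_grad():
        targets = teacher(neutral_x)[:, 1:]  # logits 1..7 only

    # 3) Distill the student on those trait-free outputs.
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    for _ in range(300):
        opt.zero_grad()
        F.mse_loss(student(neutral_x)[:, 1:], targets).backward()
        opt.step()

    # 4) Compare the "owl" logit on the preference inputs the student never saw.
    with torch.no_grad():
        print("base    owl logit:", base(pref_x)[:, 0].mean().item())
        print("teacher owl logit:", teacher(pref_x)[:, 0].mean().item())
        print("student owl logit:", student(pref_x)[:, 0].mean().item())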
| |
| ▲ | pyman 2 days ago | parent [-] | | Sorry, I commented on the wrong article. I meant to post this under: https://alignment.anthropic.com/2025/subliminal-learning/ Regarding your comment, yes, it's well known in the ML world that machines are way better than humans at picking up on correlations. In other words, the output of a model can carry traces of its internal state, so if another model is trained on those outputs, it can end up learning the patterns behind them. What's contradictory is hearing companies say: "We wrote the software, but we don't fully understand what it's doing once it's trained on trillions of tokens. The complexity is so high that weird behaviours emerge." And yet, at the same time, they're offering an API to developers, startups, and enterprise customers as if it were totally safe and reliable, while openly admitting they don't fully know what's going on under the hood. Question: why did Anthropic make its API publicly available? To share responsibility and spread the ethical risk across developers, startups, and enterprise customers, hoping that widespread use would eventually normalise training models on copyrighted materials and influence legal systems over time? Why are they saying "we don't know what's going on, but here's our API"? It's like Boeing saying: "Our autopilot's been acting up in unpredictable ways lately, but don't worry, your flight's on time. Please proceed to the gate." So many red flags. |
|
|
| ▲ | rcxdude 2 days ago | parent | prev | next [-] |
| >While Anthropic and OpenAI are still trying to make sense of what China's top computer scientists pulled off a year ago The whole reason they're accusing DeepSeek of distilling their models is that distillation was already a well-known technique, one that's relatively easy compared to creating or improving on a model in the first place. DeepSeek was impressive for how lean it was (and it shook the markets because it made obvious what savvier observers had already figured out: that the big US AI companies didn't have a huge moat), but they certainly did not come up with this concept. |
| |
| ▲ | pyman 2 days ago | parent [-] | | OpenAI raised $40 billion and Anthropic raised $10 billion, claiming they needed the money to buy more expensive Nvidia servers to train bigger models. Then Chinese experts basically said, "No, you don't." And they proved it. | | |
| ▲ | ben_w 2 days ago | parent [-] | | More like the Egg of Columbus or the Red Queen: you need to run as hard as you can just to stay where you are, and once you've got the answer it's much easier to reproduce the result. This is of course also what annoys a certain fraction of commenters in every discussion about LLMs (and, in art, diffusion models): they're overwhelmingly learning from examples made by others, not investigating things for themselves. And while many scientists will have had an exchange like Katie Mack's viral tweet* with someone who doesn't know what "research" even means and mistakes "the first thing I read" for research, the fact that many humans also do this doesn't make the point wrong when it's about AI. * https://paw.princeton.edu/article/katie-mack-09-taming-troll | | |
| ▲ | pyman 2 days ago | parent [-] | | So what are you trying to say? Do you agree that OpenAI and Anthropic are still claiming they need more data centres and more Nvidia servers to win the AI race, while still trying to understand what China actually did and how they did it? | | |
| ▲ | ben_w 2 days ago | parent [-] | | "while" makes the whole false. > Do you agree that OpenAI and Anthropic are still claiming they need more data centres and more Nvidia servers to win the AI race Yes. Red Queen[0]. > while still trying to understand what China actually did and how they did it? No. Egg of Columbus[1]. They're well aware of what DeepSeek did. Just as DeepSeek could easily reproduce American models, the DeepSeek models are not particularly challenging works for any other AI company to follow, understand, and build upon. Here's someone else's reproduction of what they did: https://huggingface.co/blog/open-r1 That it's so easy for these companies to keep up with each other is *the reason why* there's a Red Queen[0] race. [0] https://en.wikipedia.org/wiki/Red_Queen's_race [1] https://en.wikipedia.org/wiki/Egg_of_Columbus | | |
|
|
|
|
|
| ▲ | anonymoushn 2 days ago | parent | prev | next [-] |
| "subliminal learning" does not even work for use cases like distilling o1 to R1 because they do not share a base model |
| |
| ▲ | pyman 2 days ago | parent [-] | | Who's talking about that? [Edit] My bad, I thought I was commenting on Anthropic's article | | |
| ▲ | anonymoushn 2 days ago | parent [-] | | i replied to a comment by the hacker news user called pyman which claimed incorrectly that distillation was repackaged as "subliminal learning". so if you are asking me, who is talking about subliminal learning, which is unrelated to the topic of the article, the answer is that the hacker news user called pyman is doing that. | | |
|
|
|
| ▲ | danieldk 2 days ago | parent | prev | next [-] |
| > This is exactly what the DeepSeek team did, and now Anthropic is repackaging it a year later, calling it “subliminal learning” or using the teacher and student analogy to take credit for work done by Chinese researchers. What? Distillation is way older. The Hinton paper was from 2015 (maybe there is even earlier work): https://arxiv.org/abs/1503.02531 When I was still in academia, we were distilling models from BERT/RoBERTa-large to smaller models (remember when those models were considered large?) in 2019, using logits and the L2 distance of hidden layers. Before that we were also doing distillation of our own transformer/LSTM models on model outputs (though with a different motivation than model compression: to learn selectional preferences, etc.). |
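For readers who haven't seen it, the kind of objective described here (a Hinton-style soft-label term on logits plus an L2 term on hidden states) looks roughly like the following PyTorch sketch. The temperature, weighting, and tensor shapes are illustrative assumptions, not the exact recipe used in that work.

    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits: torch.Tensor,
                          teacher_logits: torch.Tensor,
                          student_hidden: torch.Tensor,
                          teacher_hidden: torch.Tensor,
                          T: float = 2.0,
                          alpha: float = 0.5) -> torch.Tensor:
        """Soft-label KL term at temperature T plus an L2 term pulling the
        student's hidden states toward the teacher's."""
        soft_targets = F.softmax(teacher_logits / T, dim=-1)
        student_log_probs = F.log_softmax(student_logits / T, dim=-1)
        # T*T restores the gradient scale after temperature softening.
        kd = F.kl_div(student_log_probs, soft_targets, reduction="batchmean") * (T * T)
        hidden = F.mse_loss(student_hidden, teacher_hidden)  # layer-wise L2
        return alpha * kd + (1.0 - alpha) * hidden

    # Stand-in tensors in place of real model outputs (batch 8, seq length 128):
    s_logits, t_logits = torch.randn(8, 30522), torch.randn(8, 30522)
    s_hidden, t_hidden = torch.randn(8, 128, 768), torch.randn(8, 128, 768)
    print(distillation_loss(s_logits, t_logits, s_hidden, t_hidden))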
| |
| ▲ | pyman 2 days ago | parent [-] | | My point is: OpenAI raised $40 billion and Anthropic raised $10 billion, claiming they needed the money to buy more expensive Nvidia servers to train bigger models. Then Chinese experts basically said, "No, you don't." And they proved it. |
|
|
| ▲ | 2 days ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | 2 days ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | ACCount36 2 days ago | parent | prev [-] |
| [flagged] |