rcxdude 2 days ago

>While Anthropic and OpenAI are still trying to make sense of what China's top computer scientists pulled off a year ago

The whole reason they're accusing them of distilling their models is that distillation is a well-known technique, and a relatively easy one compared with creating or improving a model in the first place. DeepSeek was impressive for how lean it was (and it shook the markets because it made obvious what the savvier observers had already figured out: the big US AI companies don't have a huge moat), but they certainly did not come up with the concept.
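
For context on why distillation counts as "relatively easy" compared with pretraining: in its textbook form (Hinton et al., 2015) it just means training a student model to match a teacher's softened output distribution. A minimal PyTorch sketch, purely illustrative and not any particular lab's actual recipe:

    # Toy knowledge-distillation loss: the student is trained to match
    # the teacher's temperature-softened output distribution.
    import torch
    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        # Soften both distributions, then minimize the KL divergence.
        student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
        teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
        # Scale by T^2 so gradient magnitudes stay comparable across temperatures.
        return F.kl_div(student_log_probs, teacher_probs,
                        reduction="batchmean") * temperature ** 2

    # Toy usage: random logits stand in for real teacher/student forward passes.
    teacher_logits = torch.randn(4, 32000)                        # batch of 4, 32k vocab
    student_logits = torch.randn(4, 32000, requires_grad=True)
    loss = distillation_loss(student_logits, teacher_logits)
    loss.backward()

In practice this soft-label term is usually mixed with an ordinary cross-entropy loss on hard labels, but the point stands: given access to a strong teacher's outputs, the training signal is cheap to obtain relative to building the teacher.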

pyman 2 days ago | parent [-]

OpenAI raised $40 billion and Anthropic raised $10 billion, claiming they needed the money to buy more expensive Nvidia servers to train bigger models. Then Chinese experts basically said, no you don't. And they proved it.

ben_w 2 days ago | parent [-]

More like the Egg of Columbus or the Red Queen.

You need to run as hard as you can just to stay where you are, and once you've seen the answer it's much easier to reproduce the result.

This is of course also what annoys a certain fraction of commenters in every discussion about LLMs (and, in art, diffusion models): they're overwhelmingly learning from the examples made by others, not investigating things for themselves.

While many scientists will have had an encounter like Katie Mack's viral tweet* (someone who doesn't know what "research" even means and mistakes "the first thing I read" for it), the fact that many humans also behave this way doesn't make the point wrong when it's made about AI.

* https://paw.princeton.edu/article/katie-mack-09-taming-troll

pyman 2 days ago | parent [-]

So what are you trying to say?

Do you agree that OpenAI and Anthropic are still claiming they need more data centres and more Nvidia servers to win the AI race, while still trying to understand what China actually did and how they did it?

ben_w 2 days ago | parent [-]

"while" makes the whole false.

> Do you agree that OpenAI and Anthropic are still claiming they need more data centres and more Nvidia servers to win the AI race

Yes. Red Queen[0].

> while still trying to understand what China actually did and how they did it?

No. Egg of Columbus[1]. They're well aware of what DeepSeek did. Just as DeepSeek could easily reproduce American models, the DeepSeek models are not particularly challenging for any other AI company to follow, understand, and build upon. Here's someone else's reproduction of what they did: https://huggingface.co/blog/open-r1

That it's so easy for these companies to keep up with each other is *the reason why* there's a Red Queen[0] race.

[0] https://en.wikipedia.org/wiki/Red_Queen's_race

[1] https://en.wikipedia.org/wiki/Egg_of_Columbus

pyman 2 days ago | parent [-]

Got it now, thanks for explaining.