| ▲ | bel8 8 hours ago | |
I don't think China would struggle to scrape the internet for fresh data. And they constantly publish state-of-the-art LLM research (see DS4 context compaction and cache tech). They have very capable tech giants. So while not being able to distill Western models would probably have some impact, that impact is probably shrinking as time passes. We might even see Western LLMs distilling Chinese models soon, if they aren't already to some extent. | ||
| ▲ | hnfong an hour ago | parent [-] | |
Everyone distills/copies training data. A couple of months ago, when Anthropic was complaining about Chinese distillation, people found that Claude self-identified as "DeepSeek" when asked in Chinese: https://x.com/stevibe/status/2026227392076018101 It's really a fiasco of massive hypocrisy at this point. | ||