One of the major attack vectors is distillation, where millions of auto-generated questions are sent to a model in a coordinated way and its answers are collected as training data for new LLMs. Anthropic alleges that MiniMax, DeepSeek, and Kimi were trained this way. DeepSeek 4 compares favorably to Opus, so Anthropic is probably trying to prevent DeepSeek 5 from being a bootleg Mythos. https://www.anthropic.com/news/detecting-and-preventing-dist...

pseudosavant an hour ago | parent | next [-]

It takes a lot of audacity to train on all the data you can without any license, attribution, etc., and then act like you own the outputs of the model so that no one else can make a model from your data without a license. I've lost a lot of respect for Anthropic in the last 24 hours.

zarzavat an hour ago | parent [-]

Everyone knows it's bullshit, but because these companies are being valued at a trillion dollars apiece, it's hard to say that you'd do anything differently if you were in their shoes.

shimman 42 minutes ago | parent [-]

This may surprise the cohort on Hacker News, but there are large numbers of people on this planet who value things beyond money, like ethics or having principles. Excusing absolutely repugnant behavior because there is money to be made is so deeply antihuman, but then again most people working at LLM companies are deeply antihuman to start with.

anon373839 an hour ago | parent | prev [-]

Distillation is not an "attack", despite Anthropic themselves coining the self-serving phrase "distillation attack". And as others have noted, it is precisely identical to the sort of "attack" on published works which Anthropic themselves used to train their models.

dannyw 2 minutes ago | parent [-]

Agreed. Distillation is as much of an attack as scraping is an attack ;)