bmitc 2 hours ago

Does anyone know about the jailbreaks and attacks they are referring to? These are done through model queries?

deminature 2 hours ago

One of the major attack vectors is distillation, where millions of auto-generated questions are sent to the model in a coordinated way and the answers are collected as training data for new LLMs. Anthropic alleges MiniMax, DeepSeek and Kimi were trained this way. Deepseek 4 compares favorably to Opus, so they're probably trying to prevent Deepseek 5 from being a bootleg Mythos. https://www.anthropic.com/news/detecting-and-preventing-dist...

pseudosavant an hour ago

It takes a lot of audacity to train on all the data you can without any license, attribution, etc., and then act like you own the outputs of the model so that nobody else can build a model from your data without a license. I've lost a lot of respect for Anthropic in the last 24 hours.

zarzavat an hour ago

Everyone knows it's bullshit, but because these companies are being valued at a trillion dollars apiece, it's hard to say that if you were in their shoes you'd do any differently.

shimman 41 minutes ago

This may surprise the cohort on Hacker News, but there are large numbers of people on this planet who value things beyond money, like ethics or having principles. Excusing absolutely repugnant behavior because there is money to be made is so deeply antihuman, but then again most people working at LLM companies are deeply antihuman to start with.

anon373839 an hour ago

Distillation is not an "attack", despite Anthropic themselves coining the self-serving phrase "distillation attack". And as others have noted, it is precisely identical to the sort of "attack" on published works which Anthropic themselves used to train their models.

dannyw a few seconds ago

Agreed. Distillation is as much of an attack as scraping is an attack ;)

MichaelZuo 2 hours ago

Why would you trust anything they say at face value?

When they literally just showed you they are being deceptive by sneaking in the weasel word “almost”?

alexjurkiewicz an hour ago

Firstly, this post is not the contract people are signing; it's merely a summary.

Secondly, as with all contracts, I'm sure there will be exceptions for holding data longer than 30 days with reasonable cause, e.g. a legal hold.