vrganj 8 hours ago

Anthropic stole the entire internet. Excuse my language, but they can fuck right off.

breppp 7 hours ago | parent [-]

The issue here is not whether Anthropic used Common Crawl, Alibaba also does that.

The issue is that by distilling Claude, Alibaba reuses the IP Anthropic used to train the model, which is more akin to historical Chinese reverse-engineering methods and disrespect of IP.

wongarsu 2 hours ago | parent | next [-]

If using Common Crawl or Anna's Archive in your training data is legal, then surely the same is true for using conversations with Claude. I don't see a reasonable framework where training AI on copyrighted data is ok if and only if that data is not generated by AI

(Granted, only Meta got caught using Anna's Archive, but it seems safe to assume it's common practice. And even if it weren't, the websites in Common Crawl are still covered by copyright.)

snovv_crash 7 hours ago | parent | prev | next [-]

Alibaba paid for that data though, right? They didn't hack Anthropic, they bought accounts and ran them normally.

Also, you can't copyright AI outputs. So worst case they violated the ToS.

causal 2 hours ago | parent | prev | next [-]

I wish people would stop using Anthropic’s incorrect use of the term “distill”. They don’t share logits, so you can’t distill. You can generate training data, which doesn’t sound nearly so scary.
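
To make the distinction concrete, here is a minimal Python sketch. It is illustrative only: the function names and toy inputs are assumptions, not anything Anthropic or Alibaba has published. Classic soft-target distillation compares the student's distribution to the teacher's full distribution, which requires the teacher's logits. Training on generated text only needs the token the teacher actually emitted.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, with optional temperature scaling."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over the whole vocabulary.

    Requires the teacher's logits, which a text-only API does not expose.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def hard_label_loss(sampled_token, student_logits):
    """Cross-entropy against a single sampled token.

    Requires only the teacher's text output (hypothetical token index here).
    """
    q = softmax(student_logits)
    return -math.log(q[sampled_token])
```

The first function cannot be computed from API responses that contain only text; the second can, which is the "generate training data" case.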

blackoil 7 hours ago | parent | prev | next [-]

'Issue' for who?

matheusmoreira 7 hours ago | parent | prev | next [-]

> reuses the IP Anthropic used to train the model

> disrespect of IP

Nobody other than Anthropic cares.

messe 7 hours ago | parent | prev | next [-]

> Alibaba reuses the IP Anthropic used to train the model, which is more akin to historical Chinese reverse-engineering methods and disrespect of IP

Why is this any worse than Anthropic's disrespect of IP? You've apparently drawn a distinction between the two here, but I'm failing to see what it actually is.

breppp 5 hours ago | parent [-]

Copyright law and IP law are not the same, although everyone seems to conflate the two.

Search engines, for example, historically ignored copyright law by copying excerpts or serving other sites' images; that doesn't mean someone copying Google's code has some moral free pass.

messe 5 hours ago | parent | next [-]

> Copyright law and IP law are not the same, although everyone seems to conflate the two.

Copyright law is a subset of IP law. What IP is being infringed upon here?

> Search engines, for example, historically ignored copyright law by copying excerpts or serving other sites' images

Excerpts are often considered fair use, but it depends on country.

> that doesn't mean someone copying Google's code has some moral free pass

Nobody copied Anthropic's code. They used its output to train another model. At most they violated some terms of service.

Did they maybe abuse Anthropic's subsidised pricing? Sure. But that's what happens in a free market if you sell below cost.

breppp 5 hours ago | parent [-]

> Excerpts are often considered fair use, but it depends on country.

That happened progressively. Thumbnails, for example, were ruled fair use later on, and the DMCA safe harbor was a huge gift for tech companies, because without it the ability to create platforms would have been curtailed (relaxing copyright protections in exchange for innovation).

> Nobody copied Anthropic's code. They used its output to train another model. At most they violated some terms of service

Distilling a model is a method that can push the entire market to low margins and prevent companies from making money off such research. It also copies the Anthropic-specific parts (RLHF and other specific methods) rather than the "copy of the entire web" part.

This is similar to what happened with Chinese reverse engineering of American manufacturing or PC clones killing IBM PCs.

Is it in the interest of the USA? Probably not. That's why I assume this will be backed by law eventually.

messe 5 hours ago | parent [-]

> Distilling a model is a method that can push the entire market to low margins and prevent companies from making money off such research

Then it's on Anthropic to actually price their models accordingly so that distilling isn't profitable. Why does this need a legal remedy when market forces could easily resolve this?

> Is it in the interest of the USA? Probably not.

Good. The world needs to diversify away from dependence on US technology.

breppp 4 hours ago | parent [-]

> Good. The world needs to diversify away from dependence on US technology.

In my opinion, further strengthening the CCP is a disaster for the world. A government that killed millions of its own citizens to stay in power is not who I would entrust with superintelligence. But apparently we are not going to agree on that.

vrganj 4 hours ago | parent | next [-]

When did the CCP kill millions of its own citizens to stay in power?

breppp 2 hours ago | parent [-]

The Great Leap Forward and the Cultural Revolution are two such examples.

Generally, Communist nations have historically favored technological development over human life on the scale of millions; keep that in mind as we enter a new economic revolution.

vrganj 2 hours ago | parent [-]

The Great Leap Forward wasn't "killing" people, which implies intent. It was just good old economic mismanagement.

On a related note, around 300k people die in the US every year due to causes directly attributable to poverty. [0]

In other words, ~a million every three years.

Now what?

[0] https://pmc.ncbi.nlm.nih.gov/articles/PMC10111231/

breppp an hour ago | parent [-]

> The Great Leap Forward wasn't "killing" people, which implies intent. It was just good old economic mismanagement.

If both the USSR and the CCP had millions killed in the process of modernization, without stopping once they knew the death toll, maybe there's intent after all?

How would you describe the Cultural Revolution then? Another case of economic mismanagement?

vrganj an hour ago | parent [-]

I noticed you haven't addressed my main point at all. What are the millions dying of poverty every few years in the US (in a country with like a quarter of the population!), a death toll that still hasn't been stopped?

Is there intent there as well?

tw1984 3 hours ago | parent | prev [-]

40 years ago, when the CCP was leading its people making toys and socks for the US, people like you who never made any change to the world were talking such ideological nonsense.

40 years on, when the CCP is leading its people making AI, robotics, drones, EVs, space stations and moon rovers to compete with the US, people like you who never made any change to the world are talking such ideological nonsense.

Do you live in a history museum or something like that?

breppp 2 hours ago | parent [-]

I don't know about me effecting change to the world but I am sure the tens of millions that died due to the Great Leap Forward were happy to effect change to the world so others could produce those socks

realusername 4 hours ago | parent | prev [-]

> Search engines, for example, historically ignored copyright law by copying excerpts or serving other sites' images; that doesn't mean someone copying Google's code has some moral free pass

Not sure that's the best example, as they lost that battle and had to pay; eventually it was codified in law in most countries.

vrganj 7 hours ago | parent | prev [-]

Anthropic clearly doesn't respect other people's IP, it's real rich that they now insist on theirs being worthy of protection.

Fwiw, I think the concept of IP in general is counter to human progress.

wqaatwt 39 minutes ago | parent | next [-]

> in general is counter to human progress.

Historically most evidence seems to point to the contrary.

Amongst other things, after the printing press was created, it was impossible for anyone who was an author to survive from their work unless they were independently wealthy or had rich patrons.

kataklasm 7 hours ago | parent | prev | next [-]

The practical implementation of IP? Sure, that's debatable. But the concept of IP is rooted in favoring progress. The thought process is that if one's intellectual work can be copied, reused, modified and whatnot without issues, why should anyone invent things anymore? Just wait for the next person to do it and then copy their work; that's way less effort than inventing things yourself. IP aims to protect progress by making sure inventors have an actual incentive to invent stuff. The way it's implemented is fundamentally flawed, I agree, but the concept itself? I'm not so clear on that.

vrganj 4 hours ago | parent | next [-]

The Soviet Union, for all its faults, had a fair bunch of scientific and technological breakthroughs without relying on IP.

Sure, one person gets rewarded more with the IP system. But at the same time, that breakthrough then can't be built upon by others.

Overall, I think it does more harm than good because of how it monopolizes technologies and ossifies development.

I think free sharing of knowledge will always beat intellectual stinginess.

wqaatwt 32 minutes ago | parent [-]

> fair bunch of scientific and technological breakthroughs

Outside of military technologies they had massively fallen behind the west by the 80s. Without the western tech they licensed or copied they were permanently stuck in the 50s. Even their crappy cars were licensed copies of cheap European cars from the 60s.

When it comes to consumer electronics, vehicles and a bunch of other things, they were comically behind. So it’s really not a good example.

> monopolizes technologies and ossifies development.

As bad as it might be, empirical evidence shows that historically a superior system has never existed (it might be feasible, but everything that was tried underperformed).

shimman 2 hours ago | parent | prev [-]

What absolute bollocks. Human ingenuity and innovation are limited only by the greed of elites, not by the lack of something as damaging as "IP."

Good grief. All one has to do is look at how humanity has consistently progressed due to iterating on what has existed; that is how we progress, not through some corporation that wants to rat fuck us all for a few points in share value.

wqaatwt 29 minutes ago | parent [-]

> progressed due to iterating on what has existed

Progress was extremely slow until the 1800s. Coincidentally, corporations and modern capitalism in general developed around the same time. Of course, I’m not necessarily saying that was the main or direct cause, since it isn’t exactly possible to create an experiment comparing it to other systems (though alternatives were tried and failed completely in the USSR, Maoist China and similar places).

breppp 7 hours ago | parent | prev [-]

It's more complicated than that, because Google has been legally displaying other people's copyrighted material for years.

In any case, there's still a difference between publicly available copyrighted data (and whether you can use it for model training) and the innovation around model training, RLHF, etc., which a country presumably has some interest in letting companies invest in with some legal protection (like the difference between patent law and copyright law).

wqaatwt 27 minutes ago | parent | next [-]

LLM output is not copyrightable, though? So effectively, if you pay for it, you can do whatever you want with it. That seems perfectly fair and reasonable.

platinumrad 6 hours ago | parent | prev [-]

So you're saying it's more important to safeguard slop outputs than the original work of human beings.

breppp 5 hours ago | parent [-]

No, I am saying there is a good chance that, for the good of humanity, society decides that for miracle AGI we collectively forfeit copyright in LLM training, yet IP protections for model development are still kept.

There were many cases in the early 2000s where copyright protections were relaxed for tech advancements.

jdgoesmarching 3 hours ago | parent | next [-]

“For the good of humanity we must protect what I’m working on at the expense of others because it’s super important.”

As frustrating as the anti-AI crowd can be, I see why they end up that way when the valley is full of opinions like this.

Barbing 4 hours ago | parent | prev | next [-]

Does this match the kind of eminent domain case we might see where the country needs a highway more than it needs one particular citizen's house?

When they bulldoze the house to pave the highway, they toss the homeowner a few bucks. If you take an author’s books do you owe him a share of OpenAI?

close04 3 hours ago | parent | prev | next [-]

What are you forfeiting for the good of humanity? Would you give up a big chunk of your income? What happens when this batch of “innovators” don’t deliver AGI and only enrich themselves? What happens if they do deliver AGI and (hypothetically) still keep it to themselves?

You come with the selfless proposal that everyone give to the poor $tn companies “for the good of humanity”. I’ll assume this is just hopelessly naive, but you post so insistently that it makes me wonder.

vrganj 2 hours ago | parent | prev [-]

Have you tried asking society how they feel about you acting "for their good"? Because popular sentiment seems pretty opposed to AI.