fouronnes3 4 days ago

When are we getting a GPLv4 that's AGPL + no LLM training? This is overdue.

Octoth0rpe 4 days ago | parent | next [-]

Given Meta's history of torrenting every book it could get its hands on for training, I'm not convinced that the majority of AI companies would respect that license. Maybe if we also had a better way to prove that such code was part of the training set and see a couple of solid legal victories with compensation awarded.

bayindirh 4 days ago | parent | next [-]

I'm pretty astounded that "The Stack" at least made an effort, and continues to do so, by weeding out GPL or similar strong copyleft source code from their trove, and even implemented an opt-out mechanism [0].

They look like saints when compared to today's companies.

[0]: https://huggingface.co/spaces/bigcode/in-the-stack

immibis 4 days ago | parent | prev [-]

They're also getting sued for it, and the judge ruled they had no right to torrent those books so now it's just a matter of calculating how many trillions Meta has to pay, then extracting it from them.

Octoth0rpe 4 days ago | parent [-]

Because Meta got caught. I'm not convinced that every random OSS lib will have the resources to audit every model out there for a hypothetical GPL+no training violation.

ramses0 4 days ago | parent | prev | next [-]

"Adversarial Internet" => if it touches the internet it's no longer yours. See a previous comment chain: https://news.ycombinator.com/item?id=44616163

ToucanLoucan 4 days ago | parent [-]

> if it touches the internet it's no longer yours

*Unless you're a member of the capital class, in terms of being a corporation or a wealthy individual, who can then make our two-tiered justice system work for you. As Disney is seemingly looking to do. Then it will absolutely work for you.

This is why I and people like me so often say "there is no war but the class war." Arguing about copyright misses the entire point: the law serves the large stakeholders in the system, not the people. The only thing that's changed is that there is now a large stakeholder for whom a core pillar of its ongoing business is the theft of data at industrial scale, which happens to include data of other large stakeholders, which is why we're now seeing the slap fight.

By all means enjoy it, it's very entertaining watching these people twist themselves into knots to explain why it's okay for Nintendo to sue people into the ground for distributing copies of games they no longer sell in any capacity but simultaneously it's okay for OpenAI to steal absolutely goddamned everything on the grounds that nothing has been "really" taken due to being infinitely replicable, or because it's a public research org, or whatever flimsy excuse is being employed at the time.

As it has been from the beginning, my position is: whatever the rule we decide on, it should apply to everyone. A very simple statement on very basic ethics that seems to make a lot of people very angry for some reason.

ramses0 4 days ago | parent [-]

"The law, in its majestic equality, forbids rich and poor alike to sleep under bridges, to beg in the streets, and to steal their bread." Anatole France

pjerem 4 days ago | parent | prev | next [-]

As if LLM training cared about respecting licenses. :(

ranger_danger 4 days ago | parent | prev [-]

Be the change you wish to see.

Or just literally call your program's license "AGPL + no LLM training" and that may suffice.

immibis 4 days ago | parent [-]

The AGPL says that you can ignore any further restrictions the author tried to impose on you. That's why you frame it as LLM training already violating the AGPL by making a derivative work.