immibis 4 days ago

This is also a good opportunity to remember that MIT is not a strong enough open source license, and if you want to prevent corporations making money off your work, make it AGPL or even SSPL, plus a statement that AI training creates a derivative work (the latter may or may not have any legal effect).

MIT is a donation of your labour to corporations. With a stronger license, at least they're more likely to contribute back or to pay you for a looser license.

pythonaut_16 4 days ago | parent | next [-]

Alternatively MIT does exactly what it says it does. It's up to you as an author whether you like those terms or if you'd prefer GPL, AGPL, or SSPL.

If you want a permissive license MIT is perfectly reasonable. If you want more restrictions or stronger copy-left then don't pick MIT.

rollcat 4 days ago | parent | next [-]

As far as I was able to tell, every single coding LLM out there still violates the terms of the MIT license, because the license requires attribution - and LLMs rarely (if ever?) provide any.

zetanor 4 days ago | parent [-]

I've not used AI to program and have very little interest in using AI to program, but I fail to see how laundering code through massive probabilistic lossy compression (silicon) should be treated any differently than laundering code through massive probabilistic lossy compression (biological). Should humans have to keep track of which software codebases they learn each pattern from, too?

rollcat 2 days ago | parent | next [-]

Calling humans massive probabilistic lossy compressors is an insult to curiosity, creativity, compassion, and any number of other traits that push us to advance technology. We've invented everything from Babbage and Ada's work to vacuum tubes, punch cards, and GPUs.

Code regurgitators can't even design a coherent API.

ranger_danger 3 days ago | parent | prev [-]

My understanding is this was part of the reasoning for a certain US court to rule that AI art is (at least in a default sense) fair use. You're right, both humans and AI "create" things by using things we have seen before... some say art itself can only ever be the sum of our past influences.

immibis 4 days ago | parent | prev [-]

The point is that people who think they want permissive licenses usually don't, and eventually regret choosing them when a corporation treats their work as donated labour (because it is), assuming their software is important enough to be picked up by them (if not then license choice doesn't matter anyway).

bigstrat2003 4 days ago | parent | prev | next [-]

> MIT is a donation of your labour to corporations.

No, MIT is a donation of your labor to the public. That includes corporations, yes, but it is not only corporations.

ranger_danger 4 days ago | parent [-]

I always found this stance puzzling. If the point of open source is to give your code to the public, why do people get upset when corporations do exactly what you told them they could do?

If you didn't want to give it to everyone, you shouldn't have chosen that license.

And if you choose a non-commercial license, people get upset that it's "not technically open source because the OSI says so" as if they are somehow the arbiter of this (or even should be). It's not like anyone owns the trademark to the term "open source" for software either.

Ironically, I've seen a lot of people in the last several years quit open source entirely and/or switch to closed source.

Alupis 4 days ago | parent [-]

> why do people get upset when corporations do exactly what you told them they could do?

A lot of people have been taught `corporations == bad`, part of the anti-capitalism efforts taught to our youth for a couple generations.

ranger_danger 4 days ago | parent | next [-]

Yes I understand... but they already knew that the license explicitly allows this, and they already knew companies regularly take advantage of FOSS without giving back, so I'm not sure why they were expecting to get lucky or something.

To me this is just like getting upset when someone forks your open source project. Which ironically I've seen happen a LOT. Sometimes the original developer/team even quits when that happens.

It's like... they don't actually want it to be open source, they want to be the ONLY source.

immibis 4 days ago | parent [-]

Because they don't think about it deeply - that's why reminders are necessary. They think they're only donating to people with similar attitudes to themselves. xGPL licenses (SSPL included) are the license family most similar to that...

... but MIT is what corporations told them they want. There has been a low-level but persistent campaign against xGPL in the past several years and the complaints always trace back to "the corporation I work for doesn't like xGPL." No individual free software developer has a problem with xGPL (SSPL not included).

ranger_danger 4 days ago | parent [-]

> No individual free software developer has a problem with xGPL

I do... I consider it the opposite of freedom. I think it places severe restrictions on your project that make it hard/impossible for some people (like companies) to use, especially if your project contains lots of code from other people that make it really hard/impossible to try to re-license if one day you decide you like/need money (assuming you have no CLA, I don't like those either).

But I also realize there's different kinds of freedom... freedom TO vs freedom FROM.

Some want the freedom TO do whatever they want... and others want freedom FROM the crazy people doing whatever they want.

I wish there was a happy medium but centrism doesn't seem to be very popular these days.

immibis 4 days ago | parent [-]

Which part of the GPL do you consider to be a "severe restriction" that "makes your project impossible to use"?

I agree that you can't legally take a bunch of GPL code and relicense it as proprietary. That's the point.

Freedom to/from is a false dichotomy; most rights can be expressed equivalently in either "to" or "from" form.

nurettin a day ago | parent | prev [-]

It is not conspiracy, it is human nature.

Bernard Shaw put it best:

    If at age 20 you are not a Communist then you have no heart. If at age 30 you are not a Capitalist then you have no brains.

kannanvijayan 4 days ago | parent | prev | next [-]

Tangentially, I wonder if logins and click-throughs can help address this on the legal front.

Suppose you set up a login flow with a click-through that explicitly sets the terms of access, specifying no cost for access by a person and some large cost for access by AI.

Stepping past this prompt into the content would require an AI to either lie, committing both fraud and unauthorized access of content, or behave truthfully, opting in the proprietor of the API to the associated costs.

In either case, the site operator can then go after the company doing the scraping to collect the fees as specified in the copyright contract (and perhaps some additional delta of punitive fines if content was accessed fraudulently).
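A rough sketch of the bookkeeping side of that idea, in Python. Everything here is hypothetical: the names `AccessTerms` and `AccessLog`, the fee amounts, and the assumption that the client self-declares whether it is an AI at the click-through. It only shows how declared answers could be logged and tallied; it says nothing about whether such terms would hold up legally.

```python
from dataclasses import dataclass, field
from decimal import Decimal


@dataclass(frozen=True)
class AccessTerms:
    """Hypothetical click-through terms: free for people, costly for AI."""
    human_fee: Decimal = Decimal("0")
    ai_fee: Decimal = Decimal("10000")  # placeholder amount, not from the thread


@dataclass
class AccessLog:
    """Records each accepted click-through so fees can be tallied later."""
    entries: list = field(default_factory=list)

    def accept(self, client_id: str, declared_ai: bool, terms: AccessTerms) -> Decimal:
        # The fee depends only on what the client declared at the prompt.
        fee = terms.ai_fee if declared_ai else terms.human_fee
        self.entries.append((client_id, declared_ai, fee))
        return fee

    def amount_owed(self, client_id: str) -> Decimal:
        # Sum every fee this client agreed to by clicking through.
        return sum(
            (fee for cid, _, fee in self.entries if cid == client_id),
            Decimal("0"),
        )
```

A client that truthfully declares itself as AI accrues `ai_fee` on every accepted access; one that falsely declares itself human accrues nothing in this log, which is the "lie" branch described above and would have to be established some other way.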

fouronnes3 4 days ago | parent | prev | next [-]

When are we getting a GPLv4 that's AGPL + no LLM training? This is overdue.

Octoth0rpe 4 days ago | parent | next [-]

Given Meta's history of torrenting every book it could get its hands on for training, I'm not convinced that the majority of AI companies would respect that license. Maybe if we also had a better way to prove that such code was part of the training set and see a couple of solid legal victories with compensation awarded.

bayindirh 4 days ago | parent | next [-]

I'm pretty astounded that "The Stack" at least made an effort, and continues to do so, by weeding out GPL or similar strong copyleft source code from their trove, and even implemented an opt-out mechanism [0].

They look like saints when compared to today's companies.

[0]: https://huggingface.co/spaces/bigcode/in-the-stack

immibis 4 days ago | parent | prev [-]

They're also getting sued for it, and the judge ruled they had no right to torrent those books so now it's just a matter of calculating how many trillions Meta has to pay, then extracting it from them.

Octoth0rpe 4 days ago | parent [-]

Because Meta got caught. I'm not convinced that every random OSS lib will have the resources to audit every model out there for a hypothetical GPL+no training violation.

ramses0 4 days ago | parent | prev | next [-]

"Adversarial Internet" => if it touches the internet it's no longer yours. See a previous comment chain: https://news.ycombinator.com/item?id=44616163

ToucanLoucan 4 days ago | parent [-]

> if it touches the internet it's no longer yours

*Unless you're a member of the capital class, in terms of being a corporation or a wealthy individual, who can then make our two-tiered justice system work for you. As Disney is seemingly looking to do. Then it will absolutely work for you.

This is why I and people like me so often say "there is no war but the class war." Arguing about copyright misses the entire point: The law serves the large stakeholders in the system, not the people. The only thing that's changed is there is now a large stakeholder of whom a core pillar of their ongoing business is the theft of data at industrial scale which happens to include data of other large stakeholders which is why we're now seeing the slap fight.

By all means enjoy it, it's very entertaining watching these people twist themselves into knots to explain why it's okay for Nintendo to sue people into the ground for distributing copies of games they no longer sell in any capacity but simultaneously it's okay for OpenAI to steal absolutely goddamned everything on the grounds that nothing has been "really" taken due to being infinitely replicable, or because it's a public research org, or whatever flimsy excuse is being employed at the time.

As it has been from the beginning, my position is: whatever the rule we decide on, it should apply to everyone. A very simple statement on very basic ethics that seems to make a lot of people very angry for some reason.

ramses0 4 days ago | parent [-]

"The law, in its majestic equality, forbids rich and poor alike to sleep under bridges, to beg in the streets, and to steal their bread." Anatole France

pjerem 4 days ago | parent | prev | next [-]

As if LLM training cared about respecting licenses. :(

ranger_danger 4 days ago | parent | prev [-]

Be the change you wish to see.

Or just literally call your program's license "AGPL + no LLM training" and that may suffice.

immibis 4 days ago | parent [-]

The AGPL says that you can ignore any further restrictions the author tried to impose on you; that's why you frame it as LLM training already violating the AGPL by making a derivative work.

Alupis 4 days ago | parent | prev [-]

> MIT is a donation of your labour to corporations.

Unless you are willing to spend yourself into financial ruin pursuing legal action against some faceless megacorp - it literally doesn't matter what license you use.

I've lived enough to know there is "what should be" and then there is what actually happens in reality. We don't live in a reality where everyone just does things out of the goodness of their heart...

Adding some text to your project, hosted on a public website for all to see means some people will take your code regardless of the license or your intent - and, realistically, what are you going to do about it? Nothing...

So... please, let's get off this GPL high-horse. It's not some end-all-be-all holy text that solves all of the world's problems.