afro88 7 hours ago

Same as if a regular person did it: they are responsible for it. If you're using AI, check that the code doesn't violate any licenses.
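
One toy way to make that check concrete (my own sketch in Python; in practice you'd reach for real license/provenance scanners): fingerprint the generated code and compare it against a corpus of known code, MOSS-style. The sample snippets and names below are made up for illustration.

    import hashlib

    def kgrams(text, k=5):
        # Normalize whitespace and case so reformatted copies still match.
        s = "".join(text.split()).lower()
        return [s[i:i + k] for i in range(len(s) - k + 1)]

    def fingerprints(text, k=5, w=4):
        # Winnowing (Schleimer et al., 2003 -- the idea behind MOSS):
        # keep the minimum hash in each window of w consecutive k-gram
        # hashes, giving a compact, position-robust fingerprint set.
        hs = [int(hashlib.sha1(g.encode()).hexdigest(), 16) % 2**32
              for g in kgrams(text, k)]
        return {min(hs[i:i + w]) for i in range(len(hs) - w + 1)}

    def overlap(a, b):
        # Jaccard similarity of the fingerprint sets; a high score
        # flags the pair for human review, nothing more.
        fa, fb = fingerprints(a), fingerprints(b)
        return len(fa & fb) / max(1, len(fa | fb))

    generated = "while (len--) crc = table[(crc ^ *p++) & 0xff] ^ (crc >> 8);"
    known_gpl = "while (len--) crc = table[(crc ^ *buf++) & 0xff] ^ (crc >> 8);"
    print(f"fingerprint overlap: {overlap(generated, known_gpl):.0%}")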

rzmmm 5 hours ago | parent | next [-]

In certain legal cases, a plagiarism claim can hinge on whether the person was exposed to the copyrighted work. AI models are exposed to a very large corpus of works.

cxr 5 hours ago | parent [-]

Copyright infringement and plagiarism are not the same thing, nor even closely related; the concepts aren't interchangeable. Unlike copyright infringement, plagiarism is rarely a matter for courts to decide or care about at all. It is primarily an ethical (not civil or criminal) matter: rather than being handled by the legal system, it is governed by codes of ethics within e.g. academia and journalism, which have their own extra-judicial standards and methods of enforcement.

dekhn 4 hours ago | parent [-]

I suspect they were instead thinking of patents. For example, when I worked at Google, engineers were told not to read patents, because an engineer who had read a patent and later built something infringing could expose the company to a claim of willful infringement. No other employer I've worked for has ever raised this as an issue, but the lawyers at Google warned against it regularly.

martin-t 6 hours ago | parent | prev | next [-]

As opposed to an irregular person?

LLMs are not persons, not even legal ones (legal personhood for corporations is itself a massive hack, causing serious issues such as the use of corporate finances for political gain).

A human has moral value; a text model does not. A human is limited in both time and memory; a model of text is not. I don't see why comparisons to humans have any relevance. Just because a human can do something does not mean machines run by corporations should be able to do it en masse.

The rules of copyright allow humans to do certain things because:

- Learning enriches the human.

- Once a human consumes information, he can't willingly forget it.

- It is impossible to prove how much a human-created intellectual work is based on others.

With LLMs:

- Training (let's not anthropomorphize: lossily compressing input data by detecting and extracting patterns) enriches only the corporation that owns the model.

- It's perfectly possible to create a model based only on content with specific licenses or only public domain.

- It's possible, at least in principle, to trace every single output byte to quantifiable influences from every single input byte (a toy sketch follows below). It's just not an interesting line of inquiry for the corporations benefiting from the legal gray area.
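
To make that last point concrete, here is a toy provenance sketch (my own illustration, in Python; the corpus and file names are hypothetical). Serious approaches, e.g. gradient-based influence methods like TracIn, estimate per-training-example influence on an output rather than exact matches, but the principle is the same: if you keep the training data, attribution is computable.

    from collections import defaultdict

    def build_index(corpus, n=4):
        # Remember which training documents every n-gram of tokens
        # appears in.
        index = defaultdict(set)
        for doc_id, text in corpus.items():
            toks = text.split()
            for i in range(len(toks) - n + 1):
                index[tuple(toks[i:i + n])].add(doc_id)
        return index

    def attribute(output, index, n=4):
        # Map each n-gram of the output back to candidate sources.
        toks = output.split()
        return {" ".join(toks[i:i + n]): sorted(index[tuple(toks[i:i + n])])
                for i in range(len(toks) - n + 1)
                if tuple(toks[i:i + n]) in index}

    corpus = {  # hypothetical training documents
        "gpl_driver.c": "for each device in the list acquire the lock",
        "mit_utils.c": "acquire the lock before touching shared state",
    }
    print(attribute("acquire the lock before touching shared state",
                    build_index(corpus)))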

sarchertech 7 hours ago | parent | prev [-]

How could you do that though? You can't guarantee that there aren't chunks of copied code that infringe.

Andrex 6 hours ago | parent | next [-]

Let me introduce you to the concept of submarine patents...

shevy-java 6 hours ago | parent | prev [-]

But the responsible party is still the human who added the code. Not the tool that helped do so.

aargh_aargh 6 hours ago | parent | next [-]

The Linux developers' practical concern about responsibility is not the ability to ban an author; it's that the author should take ongoing care of their contribution.

Cytobit 6 hours ago | parent | prev | next [-]

That's not going to shield the Linux organization.

cxr 4 hours ago | parent [-]

A DCO (Developer Certificate of Origin) bearing a claim of original authorship (or an assertion of other permitted use) isn't going to shield them entirely, but it can mitigate liability and damages.

sarchertech 2 hours ago | parent [-]

Can it though? As far as I know this hasn’t been tested.

sarchertech 6 hours ago | parent | prev [-]

In a court case, the responsible party could very well be the Linux Foundation, because this is a foreseeable consequence of allowing AI contributions. There's no reasonable way for a human to make such a guarantee while using AI-generated code.

Chance-Device 6 hours ago | parent | next [-]

It’s not about the mechanism: responsibility is a social construct, it works the way people say that it works. If we all agree that a human can agree to bear the responsibility for AI outputs, and face any consequences resulting from those outputs, then that’s the whole shebang.

sarchertech 6 hours ago | parent | next [-]

Sure we could change the law. It would be a stupid change to allow individuals, organizations, and companies to completely shield themselves from the consequences of risky behaviors (more than we already do) simply by assigning all liability to a fall guy.

Chance-Device 6 hours ago | parent | prev | next [-]

What law exactly are you suggesting needs to be changed? How is this any different from what already happens right now, today?

sarchertech 6 hours ago | parent [-]

Right now it's very easy not to infringe on copyrighted code if you write the code yourself. In the vast majority of cases, if you infringed, it's because you did something wrong that you could have prevented (and in the case where you didn't do anything wrong, independent creation is an affirmative defense against copyright infringement).

That is not the case when using AI generated code. There is no way to use it without the chance of introducing infringing code.

Because of that, if you tell users they can use AI-generated code and they introduce infringing code, that was a foreseeable outcome of your action. If you own a company or head an organization that benefits from contributors using AI-generated code, your company or organization could be liable.

Chance-Device 5 hours ago | parent | next [-]

It’s a foreseeable outcome that humans might introduce copyrighted code into the kernel.

I think you're looking for problems that don't really exist here; you seem committed to an anti-AI stance where none is justified.

sarchertech 5 hours ago | parent [-]

A human has to willingly violate the law for that to happen, though. There is no way for a human to use AI-generated code that doesn't carry a chance of reproducing copyrighted code. That's just expected.

If you don't think this is a problem, take a look at the terms of the enterprise agreements from OpenAI and Anthropic. Companies recognize this is an issue, so both were forced to add indemnification clauses explicitly saying they'll pay for any damages resulting from infringement lawsuits.

johnisgood 2 hours ago | parent | prev | next [-]

> Right now it's very easy not to infringe on copyrighted code if you write the code yourself.

Humans routinely produce code similar to or identical to existing copyrighted code without direct copying.

sarchertech 2 hours ago | parent [-]

They don't produce enough similar code to infringe frequently. And if they did, independent creation is an affirmative defense to copyright infringement, one that likely doesn't apply to LLMs, since LLMs have a demonstrated capability to produce code directly from their training set.

johnisgood 2 hours ago | parent [-]

You have shifted from "very easy not to infringe" to "don't infringe frequently", which concedes the original point that humans can and do produce infringing code without intent.

On independent creation: you are conflating the tool with the user. The defense turns on whether the developer had access to the copyrighted work, not whether their tools did. A developer using an LLM did not access the training set directly; they used a synthesis tool. By your logic, any developer who has read GPL code on GitHub should lose the independent-creation defense, because they have the "demonstrated capability to produce code directly from" their memory.

LLM memorization/regurgitation is a documented failure mode, not normal operation or the typical case. Training-set contamination happens, but it is rare and considered a bug. Humans also occasionally reproduce code from memory; we do not deny them the independent-creation defense wholesale because of that capability!

In any case, the legal question is not settled, but the argument that LLM-assisted code categorically cannot qualify for independent creation defense creates a double standard that human-written code does not face.

bpt3 6 hours ago | parent | prev [-]

In this case, the "fall guy" is the person who actually introduced the code in question into the codebase.

They wouldn't be some patsy that is around just to take blame, but the actual responsible party for the issue.

sarchertech 5 hours ago | parent [-]

Imagine you're a factory owner who needs a chemical delivered from across the country, but the chemical is dangerous: if the tanker truck drives faster than 50 miles per hour, it has a 0.001% chance per mile of exploding.

You hire an independent contractor and tell him that he can drive 60 miles per hour if he wants to, but that if the truck explodes, he accepts responsibility.

He does, and it explodes, killing 10 people. If the families of those 10 people have evidence that you created the conditions for the explosion in order to benefit your company, you're probably going to lose in civil court.
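
For scale, that per-mile risk compounds fast over a cross-country trip (back-of-envelope; assume 2,500 miles):

    p_per_mile = 0.00001  # 0.001% per mile, from the hypothetical above
    miles = 2500          # assumed cross-country distance
    p_explosion = 1 - (1 - p_per_mile) ** miles
    print(f"{p_explosion:.2%}")  # ~2.47% chance of an explosion per trip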

Linus benefits from the increased velocity of people using AI. He doesn't get to put all the liability on the people contributing.

bpt3 2 hours ago | parent [-]

That is a nonsensical analogy on multiple levels, and doesn't even support your own argument.

sarchertech 2 hours ago | parent [-]

Nice rebuttal.

bpt3 2 hours ago | parent [-]

Why would I put much effort into responding to a post like yours, which makes no sense and just shows that you don't understand what you're talking about?

lo_zamoyski 4 hours ago | parent | prev [-]

Responsibility is an objective fact, not just an arbitrary social convention. What we can agree or disagree about is where it rests, but that's a matter of inference, and an inference can be more or less correct. We might assign certain people certain responsibilities before the fact, but that is to charge them with the care of some good, not to blame them for things before they were charged with that care.

bitwize 4 hours ago | parent | prev [-]

Because contributions to Linux are meticulously attributed to, and remain the property of, their authors, those authors bear ultimate responsibility. If Fred Foobar sends patches to the kernel that turn out to contain copyrighted code, then, provided upstream maintainers did reasonable due diligence, the court will go after Fred Foobar for damages, and will quite likely demand that the kernel organization stop distributing copies of the kernel containing Fred's code.

sarchertech 2 hours ago | parent [-]

Anyone distributing infringing material can be liable, and it's unlikely that this technicality would actually shield anyone.

Anyone who thinks they have a strong infringement case isn't going to stop at the guy who authored the code; they're going to go after anyone with deep pockets they have a good chance of winning against.