TremendousJudge a day ago

No, the exception they are asking for (we can train on copyrighted material and the image produced is non-infringing) is copyright infringement in the most basic sense.

I'll prove it by induction: Imagine that I have a service where I "train" a model on a single image of Indiana Jones. Now you prompt it, and my model "generates" the same image. I sell you this service, and no money goes to the copyright holder of the original image. This is obviously infringement.

There's no reason why training on a billion images is any different, besides the fact that the lines are blurred by the model weights not being parseable.

slidehero a day ago | parent [-]

>There's no reason why training on a billion images is any different

You gloss over this as if it's a given. I don't agree. I think you're doing a different thing when you're sampling billions of things equally.

codedokode a day ago | parent | next [-]

The root problem is that the model reproduces Indiana Jones instead of creating a new character. This contradicts the statement that the model "learns" and "creates" like a human artist and not merely copies; obviously a human artist would not plagiarize when asked to draw a character.

chii 20 hours ago | parent | next [-]

> the model reproduces Indiana Jones

The model isn't the one infringing. It's the end user inputting the prompt.

The model itself is not a derivative work, in the same way that an artist and Photoshop aren't a derivative work when they reproduce Indiana Jones's likeness.

codedokode 14 hours ago | parent [-]

The end user didn't ask for Indiana Jones though.

CaptainFever 21 hours ago | parent | prev [-]

That does not seem obvious at all. Fan art and referencing is a thing, and there are plenty of examples of AI creating characters that do not exist anywhere in the training dataset.

TremendousJudge 8 hours ago | parent | prev [-]

That's why I said it's an argument by induction. Where's the limit for it to be different? 10 images? 100? 10,000? Where does it stop being copyright infringement, and why? Many people have paid heavy fines for much less. I don't think that "a billion images is so unfathomable compared to just one million that it truly is a difference in kind" is a valid response.