fnordpiglet an hour ago

It’s actually likely fair use on the way in as well. What’s not fair use is producing copyrighted material with the model, and the question is the extent to which model providers have to prevent that. These topics came up with the photocopier, VHS tapes, etc. The training side is more subtle because the works are clearly unlicensed and used in the model, but this is actually similar to taking a book, photocopying sections, and using them for handouts, training materials, or other uses. The crucial part is that training effectively destroys the original material, and nowhere in the model is the copyrighted material itself, even if the model can produce something similar when deliberately induced to do so. However, you the user induced it, and depending on what you do with the output, you can violate the original copyright holder’s rights. (N.b., IANAL, but this is my summary of discussing it at length with a law professor who specializes in copyright, open source, etc.)

Whether it’s moral not to remunerate everyone who produced the training material is of course important, but it’s a different question. I sort of agree with Sanders et al. that AI should be a public trust like the Alaskan oil reserves. But good luck.