amradio1989 7 days ago
I think the jury is still out on how fair use applies to AI. Fair use was not designed for what we have now. I could read a book, but it's highly unlikely I could regurgitate it, much less months or years later. An LLM, however, can. While we can say "training is like reading", it's also not like reading at all due to permanent, perfect recall. Not only does an LLM have perfect recall, it also has the ability to distribute plagiarized ideas at a scale no human can. There are a lot of questions to be answered about where fair use starts and ends for these LLM products.
dns_snek 7 days ago
Fair use wasn't designed for AI, but AI doesn't change the motivations and goals behind copyright. We should be returning to the roots: why do we have copyright in the first place, what were the goals and intent behind it, and how does AI affect them? The way this technology is being used clearly violates the intent behind copyright law; it undermines its goals and results in the harm it was designed to prevent. I believe that doing this without extensive public discussion and consensus is anti-democratic. We always end up discussing concrete implementation details of how copyright is currently enforced, never the concept itself. Is there a good word for this? Reification?
stickfigure 7 days ago
> Not only does an LLM have perfect recall

This has not been my experience. These days they are pretty good at googling, though.
heavyset_go 7 days ago
> I could read a book, but it's highly unlikely I could regurgitate it, much less months or years later.

And even if one could, it would be illegal to do. I've always found this argument for AI data laundering weird.
tpmoney 6 days ago
> I think the jury is still out on how fair use applies to AI.

The judge presiding over this case has already issued a ruling to the effect that training an LLM like Anthropic's on legally acquired material is in fact fair use. So unless someone comes up with novel claims that weren't already attempted, claims that a different form of AI is significantly different from an LLM from a copyright perspective, or tries their hand in a different circuit to get a split decision, the "jury" is pretty much settled on how fair use applies to AI. Legally acquired material used to train LLMs is fair use. Illegally obtaining copies of material is not fair use, and the transformative nature of LLMs doesn't retroactively make it fair use.
Ekaros 7 days ago
One more fundamental difference: I can't read all of the books and then copy my brain. Copying, whether making copies in general or performing multiple times, is one of the fundamental things copyright regulates. So I can accept the argument that training a model once and then using a singular instance of that model is analogous to human learning. But when you get to running multiple copies of the model, we are clearly past that.
prewett 6 days ago
I find the LLM on Google's search regularly regurgitates StackOverflow and Quora answers practically verbatim.