bigmadshoe 7 days ago

Buying used copies of books, scanning them, training an employee with the scans: fair use.

Unless legislation changes, model training is pretty much analogous to that. Of course, if the employee in question (or the LLM) regurgitates a copyrighted piece verbatim, that is a violation and would be treated accordingly in either case.

bink 7 days ago

> Buying used copies of books, scanning them, training an employee with the scans: fair use.

Does this still hold true if multiple employees are "trained" from scanned copies at the same time?

bigmadshoe 7 days ago

If it's done simultaneously, I guess that would violate copyright, which is an interesting point. Maybe there's a case to be made there with model training.

Regardless, the issue could be resolved by buying as many copies as you have concurrent model training instances. It isn't really an issue with training on copyrighted work as such, just a matter of how you do it.

arduanika 6 days ago

Computers aren't people. And analogies aren't laws.

bigmadshoe 6 days ago

Yes, but a law covering this doesn’t exist yet, so until legislation catches up, analogies are all the legal system has to work with.