| ▲ | sharkjacobs 3 days ago |
| > On June 23, 2025, the Court rendered its Order on Fair Use, Dkt. 231, granting Anthropic’s motion for summary judgment in part and denying its motion in part. The Court reached different conclusions regarding different sources of training data. It found that reproducing purchased and scanned books to train AI constituted fair use. Id. at 13-14, 30–31. However, the Court denied summary judgment on the copyright infringement claims related to the works Anthropic obtained from Library Genesis and Pirate Library Mirror. Id. at 19, 31. https://www.documentcloud.org/documents/26084996-proposed-an... > reproducing purchased and scanned books to train AI constituted fair use |
|
| ▲ | greensoap 3 days ago | parent | next [-] |
| Actually, the court really only said downloading a pirated book to store in your "library" was bad. The opinion is (intentionally?) ambiguous on whether the decision regarding copies used to train an LLM applies only to scanned books or also to pirated books. The facts found in the case are that the training datasets were made from the "library" copies of books, which included scans and pirated downloads. And the court said the training copies were fair use. The court also said the scanned library copies were fair use. The court found that the pirated library copies were not fair use. The court did not say for certain whether the pirated training copies were fair use. |
|
| ▲ | thaumasiotes 3 days ago | parent | prev [-] |
| The usual analysis was that when you download a book from Library Genesis, that is an instance of copyright infringement committed by Library Genesis. This ruling appears to reverse that analysis. |
| |
▲ | papercrane 3 days ago | parent [-] | | Do you have a source for that? MAI Systems Corp. v. Peak Computer, Inc. established that even creating a copy in RAM is considered a "copy" under the Copyright Act and can be infringement. | | |
▲ | parineum 3 days ago | parent [-] | | It's not an issue of where it's being copied, it's who's doing the copying. Library Genesis has one copy. It then sends you one copy and keeps its own. The entity that violated the _copy_right is the one that copied it, not the one with the copy. | | |
| ▲ | masfuerte 3 days ago | parent [-] | | There are many copies made as the text travels from Library Genesis to Anthropic. This isn't just of theoretical interest. English law has specific copyright exemptions for transient copies made by internet routers, etc. It doesn't have exemptions for the transient copies made by end users such as Anthropic, and they are definitely infringing. Of course, American law is different. But is it the case that copies made for the purpose of using illegally obtained works are not infringing? | | |
| ▲ | thaumasiotes 3 days ago | parent [-] | | > But is it the case that copies made for the purpose of using illegally obtained works are not infringing? Well, the question here is "who made the copy?" If you advertise in seedy locations that you will send Xeroxed copies of books by mail order, and I order one, and you then send me the copy I ordered, how many of us have committed a copyright violation? | | |
| ▲ | masfuerte 3 days ago | parent [-] | | Copyright law is literally about the copies. A xeroxed book is exactly one copy. Mailing and reading that book doesn't copy it any further. In contrast, you can't do anything with digital media without making another copy. > "Who made the copy?" This begs the question. With digital media everybody involved makes multiple copies. |
|
|
|
|
|