shakna 2 days ago

There are two parts here.

The first:

> it was a fair use because all Anthropic did was replace the print copies it had purchased for its central library

It is only fair use where Anthropic had already purchased a license to the work. Which has zero to do with scraping - a purchase was made, an exchange of value, and that comes with rights.

The second, which involves a section of the judgement a little before your quote:

> And, as for any copies made from central library copies but not used for training, this order does not grant summary judgment for Anthropic.

This is where the court refused to make any ruling. There was no exchange of value here, such as would happen with scraping. The court made no ruling.

tpmoney 2 days ago | parent [-]

I believe you are misinterpreting the ruling. Remember that a copyright claim must inherently argue that copies of the work are being made. To that end, the case analyzes multiple "copies" alleged to have been made.

1) "Copies used to train specific LLMs", for which the ruling is:

> The copies used to train specific LLMs were justified as a fair use.

> Every factor but the nature of the copyrighted work favors this result.

> The technology at issue was among the most transformative many of us will see in our lifetimes.

Notable here is that all of the "copies used to train specific LLMs" were copies made from books Anthropic purchased. But also of note is that Anthropic need not have purchased them, as long as they had obtained the original sources legally. The case references the Google Books lawsuit as an example of something Anthropic could have done to avoid pirating the books they did pirate: there, Google obtained the original materials on loan from willing and participating libraries, and did not purchase them.

2) "The copies used to convert purchased print library copies into digital library copies", where again the ruling is:

> justified, too, though for a different fair use. The first factor strongly favors this result, and the third favors it, too. The fourth is neutral. Only the second slightly disfavors it. On balance, as the purchased print copy was destroyed and its digital replacement not redistributed, this was a fair use.

Here one might argue that the use of GPL code is different in that, in making the copy, no original was destroyed. But it's also very likely that this wouldn't apply at all in the case of GPL code, because there was no original physical copy to convert into a digital format. The code was already digitally available.

3) "The downloaded pirated copies used to build a central library" where the court finds clearly against fair use.

4) "And, as for any copies made from central library copies but not used for training" where, as you note, Judge Alsup declined to rule. But notice particularly that this is referring to copies made FROM the central library AND NOT for the purposes of training an LLM. The copies made from purchased materials to build the central library in the first place were already deemed fair use. And making copies from the central library to train an LLM from those copies was also determined to be fair use. The copies obtained by piracy were not. But for uses not pertaining to the training of an LLM, the judge is declining to make a ruling here because there was not enough evidence about what books from the central library were copied for what purposes and what the source of those copies was. As he says in the ruling:

> Anthropic is not entitled to an order blessing all copying “that Anthropic has ever made after obtaining the data,” to use its words

This declination applies both to the purchased and pirated sources, because it's about whether making additional copies from your central library copies (which themselves may or may not have been fair use) automatically qualifies as fair use. And this is perfectly reasonable. You have a right as part of fair use to make a copy of a TV broadcast to watch at a later time on your DVR. But having a right to make that copy does not inherently mean that you also have a right to make a copy from that copy for any other purposes. You may (and almost certainly do) have a right to make a copy to move it from your DVR to some other storage medium. You may not (and almost certainly do not) have a right to make a copy and give it to your friend.

At best, an argument that GPL software wouldn't be covered under the same considerations of fair use that this case considers would require arguing that the copies of GPL code obtained by Anthropic were not obtained legally. But that's likely going to be a very hard argument to make given that GPL code is freely distributed all over the place with no attempts made to restrict who can access that code. In fact, the GPL demands that if you distribute software derived from that code, you MUST make copies of the code available to anyone you distribute the software to. Any AI trainer would simply need to download Linux or emacs, and the GPL requires the person they downloaded that software from to provide them with the source code. How could you then argue that the original source from which copies were made was obtained illicitly when the terms of downloading the freely available software mandated that they be given a copy?

shakna 2 days ago | parent [-]

> How could you then argue that the original source from which copies were made was obtained illicitly when the terms of downloading the freely available software mandated that they be given a copy?

By the license and terms such copies are under.

> For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.

You _must_ show the terms. If you copy the GPL code, and it inherits the license, as the terms say it does, then you must also copy the license.

The GPL does not give you an unfettered right to copy, it comes with terms and conditions protecting it under contract law. Thus, you must follow the contract.

The GPL goes to some lengths to define its terms.

> A "covered work" means either the unmodified Program or a work based on the Program.

> Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.

It is not just the source code that you must convey.

tpmoney 21 hours ago | parent [-]

> By the license and terms such copies are under.

Which clause of the GPL requires the receiver of GPL code to agree to the terms of the GPL before being allowed to receive the source code that they are entitled to under the license? Because that would expressly contradict the first sentence of section 9:

    You are not required to accept this License in order to receive or run a copy of the Program.

Isn't that one of the key points to the GPL? That the provisions of it only apply to you IF you decide to distribute GPL software but that they do not impose any restrictions on the users of the software? Surely you're not suggesting that anyone who has ever seen the source code of a GPLed piece of software is permanently barred from contributing to or writing similar software under a non-GPL license simply by the fact that they received the GPLed source code.

> If you copy the GPL code, and it inherits the license, as the terms say it does, then you must also copy the license.

> The GPL does not give you an unfettered right to copy, it comes with terms and conditions protecting it under contract law. Thus, you must follow the contract.

I agree that the GPL does not give you an unfettered right to copy. But the GPL, like all such licenses, is still governed by copyright law. And "fair use" is an exception to the copyright laws that allows you to make copies that you are not otherwise authorized to make. No publisher can put additional terms in their book, even if they wrap it in shrinkwrap, that deny you the right to use that book for various fair use purposes like quoting it for criticism or parody. The Sony terms and conditions for the PlayStation very clearly forbid copying the BIOS or decompiling it. But those terms are null and void when you copy the BIOS and decompile it for making a new emulator (at least in the US) because the courts have already ruled that such use is fair use.

So it is with the GPL. By default you have no right to make copies of the software at all. The GPL then grants you additional rights you normally wouldn't have under copyright law, but only to the extent that when exercising those rights, you comply with the terms of the GPL. But "Fair Use" then goes beyond that and says that for certain purposes, certain types and amounts of copies can be made, regardless of what rights the publisher does or does not reserve. This would be why the GPL specifically says:

    This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.

Fair use (and its analogs in other countries) supersedes the GPL. And even the GPL FAQ[1] acknowledges this fact:

    Do I have “fair use” rights in using the source code of a GPL-covered program? (#GPLFairUse)

    Yes, you do. “Fair use” is use that is allowed without any special
    permission. Since you don't need the developers' permission for such use, you
    can do it regardless of what the developers said about it—in the license or
    elsewhere, whether that license be the GNU GPL or any other free software
    license.

[1]: https://www.gnu.org/licenses/gpl-faq.en.html#GPLFairUse