ben_w 2 days ago

> Freely available on the web doesn't mean it's in the Public Domain.

Doesn't need to be.

> The "lawfully obtained" part of your argument is patently untrue. You can legally obtain something, but that doesn't mean any use of it is automatically legal as well.

I didn't say "any" use, I said this specific use. Here's the quote from the judge who decided this:

  5. OVERALL ANALYSIS.
  After the four factors and any others deemed relevant are “explored, [ ] the results [are] weighed together, in light of the purposes of copyright.” Campbell, 510 U.S. at 578. The copies used to train specific LLMs were justified as a fair use. Every factor but the nature of the copyrighted work favors this result. The technology at issue was among the most transformative many of us will see in our lifetimes.
- https://storage.courtlistener.com/recap/gov.uscourts.cand.43...

> Otherwise, the recent Spotify dump by Anna's Archive would be legal as well.

I specifically said copyright infringement was separate. Because, guess what, so did the judge, in the next paragraph but one after the quote I just gave you.

> For instance, since the advent of LLM crawling, I've added the "No Derivatives" clause to the CC license of anything new I publish to the web. It's still freely accessible, can be shared on, etc., but it explicitly prohibits using it for training ML models. I even add an additional clause to that effect, should the legal interpretation of CC-ND ever change. In short, anyone training an LLM on my content is infringing my rights, period.

It will be interesting to see if that holds up in future court cases. I wouldn't bank on it if I were you.