CaptainFever 8 hours ago

Also, is there really any benefit to stripping author metadata? Was it basically a preprocessing step?

It seems to me that it shouldn't really affect model quality all that much, should it?

Also, in the amended complaint:

> not to notify ChatGPT users when the responses they received were protected by journalists’ copyrights

Wasn't it already quite clear that as long as the articles weren't replicated, it wasn't protected? Or is that still being fought in this case?

In the decision:

> I agree with Defendants. Plaintiffs allege that ChatGPT has been trained on "a scrape of most of the internet," Compl. ¶ 29, which includes massive amounts of information from innumerable sources on almost any given subject. Plaintiffs have nowhere alleged that the information in their articles is copyrighted, nor could they do so. When a user inputs a question into ChatGPT, ChatGPT synthesizes the relevant information in its repository into an answer. Given the quantity of information contained in the repository, the likelihood that ChatGPT would output plagiarized content from one of Plaintiffs' articles seems remote. And while Plaintiffs provide third-party statistics indicating that an earlier version of ChatGPT generated responses containing significant amounts of plagiarized content, Compl. ¶ 5, Plaintiffs have not plausibly alleged that there is a "substantial risk" that the current version of ChatGPT will generate a response plagiarizing one of Plaintiffs' articles.

freejazz 8 hours ago | parent

> Also, is there really any benefit to stripping author metadata? Was it basically a preprocessing step?

Have you read 1202? It's all about hiding your infringement.