cogman10 | 16 hours ago
I get the feeling, but that's not what this is. The NYTimes has produced credible evidence that OpenAI is simply stealing and republishing their content. The question they have to answer is "to what extent has this happened?" That's a question they fundamentally cannot answer without these chat logs. That's what discovery, especially in a copyright case, is about.

Think about it this way: suppose this were a book store selling illegal copies of books. A very reasonable discovery request would be "show me your sales logs." The whole log needs to be produced; otherwise you can't really trust that it's the real log.

That's what the NYTimes lawyers are after. They want the chat logs so they can do their own searches to find NYTimes text within the responses. They can't know how often that has happened, and OpenAI has an obvious incentive to simply say "oh, that never happened."

This evidence is relevant because it will directly feed into how much money NYT and OpenAI ultimately settle for. If this never happens, the amount will be low. If it happens a lot, the amount will be high. And if the case goes to trial, it will be used in the damages phase, assuming NYT wins.

The user has no right to privacy, the same as how any internet service can be (and has been) compelled to produce private messages.
glenstein | 15 hours ago
> That's what NYTimes lawyers are after. They want the chat logs so they can do their own searches to find NYTimes text within the responses.

The trouble with this logic is that the NYT already made that argument and lost, as applied to the original discovery scope of 1.4 billion records. The question now is about a narrower scope, about the means of review, and about proposed processes for anonymization. They have a right to some form of discovery, but not to a blank-check extrapolation that sidesteps the legitimate privacy issues raised both in OpenAI's statement and throughout this thread.
protocolture | 3 hours ago
> NYTimes has produced credible evidence that OpenAI is simply stealing and republishing their content

They shouldn't have any rights to the data after it's released.

> That's a question they fundamentally cannot answer without these chat logs.

They are causing more damage than anything ChatGPT could have caused to the NYT. Privacy needs to be held higher than corporate privilege.

> Think about it this way. Let's say this were a book store selling illegal copies of books.

Think of it this way: no book should be illegal.

> They can't know how often that's happened and OpenAI has an obvious incentive to simply say "Oh that never happened".

NYT glazers do more to uphold OpenAI as a privacy-respecting platform than OpenAI has ever done.

> If this never happens then the amount will be low.

It should be zero, plus compensation from the NYT to the affected OpenAI users.

> The user has no right to privacy.

And this needs to be remedied immediately.

> The same as how any internet service can be (and have been) compelled to produce private messages.

And this needs to be remedied immediately.
tantalor | 15 hours ago
> The user has no right to privacy

The correct term for this is a prima facie right. You do (arguably) have a right to privacy, but it is outweighed by the interest in enforcing the rights of others under copyright law. Similarly, liberty is a prima facie right; you can be arrested for committing a crime.
realusername | 11 hours ago
> NYTimes has produced credible evidence that OpenAI is simply stealing and republishing their content. The question they have to answer is "to what extent has this happened?"

Credible to whom? In their supposed "investigation", they sent a whole page of text and complex pre-prompting and still failed to get the exact content back word for word, something users would never do anyway. And that's probably the best they've got, since they didn't publish their other attempts.
throw20251110 | 7 hours ago
> Think about it this way. Let's say this were a book store selling illegal copies of books. A very reasonable discovery request would be "Show me your sales logs". The whole log needs to be produced otherwise you can't really trust that this is the real log.

Your claim doesn't hold up, my friend. It's inaccurate because nobody archives an entire dialogue with a seller for the record, and you certainly don't have to show identification to purchase a book.
terminalshort | 8 hours ago
Even if OpenAI is reproducing pieces of NYT articles, they still face a difficult argument, because it is in no way a practical means of accessing paywalled NYT content, especially compared to the alternatives. The entire value proposition of the NYT is news coverage, and probably 99.9% of their page views come from stories posted so recently that they aren't even in the training set of LLMs yet. If I want to reproduce an NYT story from an LLM, it's a prompt-engineering mess, and I can only get old ones. On the other hand, I can read any NYT story from today by archiving it: https://archive.is/5iVIE. So why is the NYT suing OpenAI and not the Internet Archive?
antonvs | 12 hours ago
> The user has no right to privacy. The same as how any internet service can be (and have been) compelled to produce private messages.

The legal term is "expectation of privacy", and it does exist, albeit increasingly weakly in the US. There are exceptions, such as a subpoena, but that doesn't mean anyone can subpoena anything for any reason; there has to be a legal justification. It's not clear to me that such a justification exists in this case.
observationist | 12 hours ago
You don't hate the media nearly enough. "Credible" my ass. They hired "experts" who used prompt engineering and thousands of repetitions to find highly unusual and specific methods of eliciting text from the training data that matched their articles. OpenAI has since taken measures to limit such methods and to prevent arbitrary wholesale reproduction of copyrighted content. That would have been the end of it if the NYT were engaging in good faith.

The NYT is after what they consider "their" piece of the pie. They want to insert themselves as middlemen: pure rent-seeking, second-hander, sleazy-lawyer behavior. They haven't been injured; they were already dying, and this lawsuit is a hail-mary attempt at grifting some life support. Behavior like the NYT's is why we can't have nice things. They're not entitled to exist, and engaging in behavior like this makes me want them to stop existing, the faster the better.

Copyright law is what you get when a bunch of lawyers figure out how to encode the monetization of IP rights into the legal system, having paid off legislators over decades, such that the people who make the most money off copyrights are effectively hoarding those copyrights and never actually produce anything or add value to the system. They rent-seek, gatekeep, and viciously drive off any attempt at reform or competition. Institutions that once produced valuable content instead coast on the efforts of their predecessors and invest the proceeds in lawsuits, lobbying, and the purchase of more IP.

They, the NYT, are exploiting a finely tuned and deliberately crafted set of laws meant to screw actual producers out of percentages. I'm not a huge OpenAI fan, but IP law is a whole different level of corrupt stupidity at the societal scale. It's gotcha games all the way down, and we should absolutely and ruthlessly burn down that system of rules and salt the ground over it.

There are trivially better systems that can be explained in a single paragraph, instead of requiring books' worth of legal code and complexity.
Hizonner | 16 hours ago
[flagged]
sroussey | 16 hours ago
> The user has no right to privacy. The same as how any internet service can be (and have been) compelled to produce private messages.

This is nonsense. I've personally been involved in these things, and I fought to protect user privacy at every level and never lost.