pczy 8 hours ago

This policy applies across all providers. Here is the warning in Cursor: https://i.redd.it/7sfyker2ya6h1.png

Note that Anthropic has committed not to train models on logged data, so I don’t understand some of the concerns here. What exactly is your threat model? That Anthropic would train models contrary to their terms of service? That you trust them enough not to log your data prior to this, but not enough to trust their stated limits on how logged data will be used now?

Edit: I am partially convinced by some of the replies. However, it is worth noting that this change primarily affects Enterprise users. Data from consumer plans is already retained for 30 days. Source: https://privacy.claude.com/en/articles/10023548-how-long-do-...

zmmmmm 7 hours ago | parent | next [-]

> you trust them enough not to log your data prior to this, but not enough to trust their stated limits on how logged data will be used now

It doesn't really matter how much you happen to trust another party. In the regulatory world it only matters what contracts they will sign that guarantee their compliance. We do have those with AWS, we don't with Anthropic. If Anthropic physically captures the data, they just moved themselves outside the boundary of parties who we can do business with. Unless they want to sign a contract and implement all the corresponding compliance measures. They are insane if they think that's a good deal for them to do all that right now in every jurisdiction where AWS operates, when AWS has already spent a decade building it up.

tyingq 7 hours ago | parent [-]

It will absolutely cause some non-trivial number of customers to shift their configs away from Anthropic.

aveao 6 hours ago | parent | next [-]

It's worthwhile to remember that this is only true of Mythos/Fable and other future models of "similar or higher capability levels" (Anthropic is treating this as a new tier of model above Opus). Anyone who's already been happy using Haiku/Sonnet/Opus on Bedrock will not be affected by this at all.

abofh 5 hours ago | parent | next [-]

Yes and no. Anthropic controls what is determined to be "similar or higher" and when models are deprecated. Will Sonnet 4.7 be "too powerful"? Because once it's released, 4.6's days are numbered.

This created a huge future risk for our org and we're already scheduling meetings over it. We're in a regulated industry: we can't lose control over our data governance or residency controls, let alone accept the lack of visible audit trails for data that could reveal customer information or PII.

Just an absolute bomb of a release

nijave 2 hours ago | parent | prev [-]

>Anyone who's already been happy using Haiku/Sonnet/Opus on Bedrock will not be affected by this at all

It is still adding operational overhead, because we now need to vet all models and deny access to any that retain data.

Previously it was "use and experiment with anything Bedrock offers--the data stays in AWS so we are not concerned"
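
For what it's worth, the "deny access" half could look roughly like the sketch below: a small Python helper that builds an IAM policy document denying invocation of a list of Bedrock models. The model IDs are hypothetical placeholders, and the function and variable names are mine. Cross-region inference profiles use a different ARN form, so a real policy would probably need to cover those as well. This only generates the JSON; attaching it (for example as an SCP) is left out.

```python
import json

# Hypothetical model IDs, for illustration only. Substitute whatever IDs
# your own vetting process flags as retaining data outside AWS.
RETAINING_MODEL_IDS = ["example.retaining-model-v1", "example.retaining-model-v2"]


def build_deny_policy(model_ids):
    """Return an IAM policy document denying invocation of the given Bedrock models."""
    # Foundation-model ARNs have no account ID; "*" matches every region.
    resources = [
        f"arn:aws:bedrock:*::foundation-model/{model_id}" for model_id in model_ids
    ]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyDataRetainingModels",
                "Effect": "Deny",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                ],
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_deny_policy(RETAINING_MODEL_IDS), indent=2))
```

The overhead is less the policy itself than keeping the list current every time a new model shows up.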

jerf 7 hours ago | parent | prev | next [-]

Which will work for the several weeks it takes for the other commercial providers to follow suit.

The tides are turning. AI companies are IPO'ing. They've gotten where they are by selling $5 bills for $1, to update the old VC adage. I think we can look forward to them rewriting the contracts, both literal and social, on AI going forward to capture a lot more of the value. Or, to put it in more HN-friendly terms, it may not be immediately obvious on a casual viewing, but you're looking at the beginning of the enshittification process hitting AI. The term is a bit deceptive in some sense, because it's not like anyone ever sets out with a terminal goal of making something shitty. It's downstream of trying to capture more value in the customer/vendor relationship by not giving the customer any more value than is barely necessary.

How's coding with qwen doing? The only thing that's going to stop the AI providers from extracting all the value until it's just barely worth using is the free competition.

abofh 6 hours ago | parent | next [-]

Bedrock supports many models. Open weights models aren't far behind, maybe a year, 18 months.

The fact that they could have done this while respecting data residency rules, and chose not to, tells me all I need to know: this is for Anthropic's IPO, not for user safety.

pixl97 5 hours ago | parent [-]

>Open weights models aren't far behind, maybe a year, 18 months.

No, open weights are always a year or more behind. By the time that year passes, Anthropic/OpenAI/Google will have some new model that is ahead of the open models by a year.

Looking at computer security for the last 30 years, no one gives a fuck about user safety. Companies care about profits, and individuals don't care enough for strong laws.

We'll be back here on HN in another year talking about why we should give our retina scans and blood samples to Anthropic to use the model, with a ton of people doing it anyway. It's just the way humans are.

tyingq 4 hours ago | parent | prev [-]

Surely some provider will see the then-open opportunity and offer something to capture it.

tokioyoyo 4 hours ago | parent | prev [-]

You’re underestimating how much companies are willing to bend over backwards if they can “get ahead with a god model” compared to their competitors.

nicce 7 hours ago | parent | prev | next [-]

> Note that Anthropic has committed not to train models on logged data, so I don’t understand some of the concerns here. What exactly is your threat model? That

Like Meta had committed to respect your privacy. Replace the name of the company with any of the top 50 companies in the world and count how many have kept their promises, or are doing just fine after breaking the rules. Is there any legislation in the U.S. that can bankrupt a company for violating this? If not, there are no guarantees.

Meta openly torrented books and nobody asked them to remove/destroy their AI models. Similarly, for Anthropic, it was just a business cost. They were allowed to keep the models. No real consequences for breaking the rules.

kevincox 7 hours ago | parent | prev | next [-]

It adds another provider that you have to trust with your data. Previously the assumption was that AWS was securely handling your data, and you may have had the data on AWS to start with anyway. Now you have two providers handling your data, which doubles your risk if you trust them equally. If you think AWS has more robust data controls than Anthropic, then it more than doubles your risk.

You may also have data management requirements such as allowed storage and transit countries as well as various certifications and contracts that you now need to extend to the second data processor.

Basically, if you are already using AWS, just adding the AWS-only Bedrock model is legally easy and doesn't really change your security posture. If you now also need to log your data to Anthropic, it makes the choice much more complicated.
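
To put rough numbers on the "doubles your risk" point, here is a back-of-the-envelope sketch. It assumes each provider leaks independently with some probability over a given period; the probabilities are made up for illustration, not estimates for any real company.

```python
def combined_breach_risk(*probs):
    """Probability that at least one of several independent providers leaks the data."""
    safe = 1.0
    for p in probs:
        safe *= 1.0 - p  # chance that this provider does NOT leak
    return 1.0 - safe


if __name__ == "__main__":
    one_provider = combined_breach_risk(0.01)
    equal_trust = combined_breach_risk(0.01, 0.01)
    less_trusted = combined_breach_risk(0.01, 0.03)
    print(f"one provider:           {one_provider:.4f}")
    print(f"two, trusted equally:   {equal_trust:.4f}")
    print(f"two, second less solid: {less_trusted:.4f}")
```

For small probabilities the equal-trust case is just under twice the single-provider risk, and it goes past double as soon as the second provider is the weaker one.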

_jab 7 hours ago | parent | prev | next [-]

Both can be true simultaneously. Anthropic can probably be trusted not to train on our Fable sessions, but eroding ZDR as the industry standard still sets a dangerous precedent.

There's a parallel between data retention and general mass surveillance. Sure, both systems can be used for purely benign purposes, with appropriate safeguards in place. But history shows that surveillance systems are alarmingly easy to co-opt for nefarious means, and model providers do have a heck of an incentive to leverage retained data for internal means.

This is worth protesting, even if I believe this policy itself does not immediately compromise my privacy.

krzyk 7 hours ago | parent | prev | next [-]

> Note that Anthropic has committed not to train models on logged data, so I don’t understand some of the concerns here. What exactly is your threat model? That Anthropic would train models contrary to their terms of service? That you trust them enough not to log your data prior to this, but not enough to trust their stated limits on how logged data will be used now?

It is a different thing when they say they don't store your data.

And when they say they store your data for 30 days and review it for "issues", it makes your "spider sense" tingle. Who will review it and how, what "issues" are they looking for, etc. It is too vague, and they can keep this "dangerous" model for themselves.

technojamin 7 hours ago | parent | prev | next [-]

Someone has never dealt with HIPAA laws and it shows.

aveao 6 hours ago | parent [-]

Who out there is going to be feeding patient medical data to Mythos/Fable?

shakna 5 hours ago | parent | next [-]

Whoever Anthropic can convince to help them build a competitor to OpenEvidence, which already feeds patient medical data into its systems.

kube-system 5 hours ago | parent | prev [-]

...the same groups who are currently feeding it to Sonnet and Opus?

Well, they won't be feeding it to Fable unless Anthropic can provide a signed BAA.

HarHarVeryFunny 7 hours ago | parent | prev | next [-]

Once you start storing anything, whether credit card numbers or AI inputs, there is a possibility (if not in fact a probability) that you'll be hacked and it will leak.

Given Anthropic's failure to secure their own source code, do you really trust them to secure yours?

throw1234567891 6 hours ago | parent | prev | next [-]

> Note that Anthropic has committed not to train models on logged data, so I don’t understand some of the concerns here. What exactly is your threat model?

First of all, will they respect that promise in the future? Because, you know… they already received your data and by some legal quirks they are already required to store it for so many years. “What’s your threat model”, uhh, sending confidential information to a third party.

It’s okay if you do this with your own personal property. But if you are working on client projects, what, are you going to start shipping customer data under NDA without consent? Good luck in court.

zsoltkacsandi 7 hours ago | parent | prev | next [-]

We shipped software to governments and some big companies where this is a big no-no. Try to explain to your clients that during the development process some pieces were sent to Anthropic, and they might keep them for whatever reason.

doctorpangloss 6 hours ago | parent | prev [-]

here's how they train on your data:

an inference request comes in

claude fable RESTful API service does the stuff, some backend systems run the prefill and batch decode, and your conversation is cached for 5 minutes in some prefix cache.

the request is also sent to claude paraphraser, which does almost exactly the same thing as the compactor and rewrites your conversation.

then they record the paraphrased conversation and train on that. it keeps the salient parts of the conversation, like whatever internal knowledge you have, and disposes of anything that could have been correlated with the earlier conversation, which is easy to do because verification is a string comparison.