ispeaknumbers 4 days ago

I'm not sure if you can claim these were "less prevalent than anecdotal online reports". From their article:

> Approximately 30% of Claude Code users had at least one message routed to the wrong server type, resulting in degraded responses.

> However, some users were affected more severely, as our routing is "sticky". This meant that once a request was served by the incorrect server, subsequent follow-ups were likely to be served by the same incorrect server.

30% of Claude Code users getting a degraded response is a huge bug.

extr 4 days ago | parent [-]

I don't know about you, but my feed is filled with people claiming that they are surely quantizing the model, Anthropic is purposefully degrading things to save money, etc etc. 70% of users were not impacted. 30% had at least one message degraded. One message is basically nothing.

I would have appreciated if they had released the full distribution of impact though.

lmm 4 days ago | parent | next [-]

> 30% had at least one message degraded. One message is basically nothing.

They don't give an upper bound though. 30% had at least one message degraded. Some proportion of that 30% (maybe most of them?) had some larger proportion of their messages (maybe most of them?) degraded. That matters, and presumably the reason we're not given those numbers is that they're bad.

mirekrusin 4 days ago | parent | prev | next [-]

The routing bug was sticky, so "one message is basically nothing" is not what was happening: once you were affected, your follow-up requests were likely to be affected too.
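
A toy sketch of why stickiness matters for the "at least one message" figure. Everything here is an assumption for illustration: the function names, the per-message misroute probability, the stickiness probability, and the session length are made up and are not numbers from the article. It compares the average number of degraded messages among affected users with and without sticky routing.

```python
import random

def degraded_count(n_messages, p_misroute, p_stick, rng):
    """Count degraded messages in one session under a toy sticky-routing model."""
    degraded = 0
    on_bad_server = False
    for _ in range(n_messages):
        if on_bad_server:
            # Sticky: stay on the bad server with probability p_stick.
            on_bad_server = rng.random() < p_stick
        else:
            # Otherwise each message has a small chance of being misrouted.
            on_bad_server = rng.random() < p_misroute
        degraded += on_bad_server
    return degraded

def summarize(p_stick, n_users=2000, n_messages=50, p_misroute=0.007, seed=0):
    """Return (share of users with >= 1 degraded message, mean degraded count among them)."""
    rng = random.Random(seed)
    counts = [degraded_count(n_messages, p_misroute, p_stick, rng) for _ in range(n_users)]
    affected = [c for c in counts if c > 0]
    return len(affected) / n_users, sum(affected) / len(affected)

# Hypothetical settings: no stickiness vs. strong stickiness.
plain_share, plain_mean = summarize(p_stick=0.0)
sticky_share, sticky_mean = summarize(p_stick=0.9)
print(f"no stickiness: {plain_share:.0%} affected, {plain_mean:.1f} degraded messages each")
print(f"sticky:        {sticky_share:.0%} affected, {sticky_mean:.1f} degraded messages each")
```

In this toy model the share of affected users looks similar in both cases, while the number of degraded messages per affected user is much larger with stickiness. That is the part a single "at least one message" figure does not show.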

dytyruio 4 days ago | parent | prev | next [-]

> Anthropic is purposefully degrading things to save money

Regardless of whether it’s to save money, it’s purposefully inaccurate:

“When Claude generates text, it calculates probabilities for each possible next word, then randomly chooses a sample from this probability distribution.”

I think the reason for this is that if you were to always choose the most probable next word, you may actually always end up with the wrong answer and/or get stuck in a loop.

They could sandbag their quality or rate limit, and I know they will rate limit because I’ve seen it. But, this is a race. It’s not like Microsoft being able to take in the money for years because people will keep buying Windows. AI companies can try to offer cheap service to government and college students, but brand loyalty is less important than selecting the smarter AI to help you.

andy99 4 days ago | parent | next [-]

> I think the reason for this is that if you were to always choose the highest probable next word, you may actually always end up with the wrong answer and/or get stuck in a loop.

No, it's just the definition of sampling at non-zero temperature. You can set T=0 to always get the most likely token. Temperature trades off consistency for variety. You can set T to zero in the API; I assume the defaults for Claude Code and their chat are nonzero.
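
A minimal sketch of the temperature point, in plain Python rather than any real inference API. The `sample_token` helper and the example logits are hypothetical. At T=0 it always returns the highest-scoring token (greedy); at T>0 it samples from the softmax of the logits divided by T.

```python
import math
import random

def sample_token(logits, temperature, rng=random):
    """Pick a token index from raw logits; temperature 0 means greedy argmax."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    # random.choices normalizes the weights, so this is softmax sampling.
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]

logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

# T=0: the same token every time.
greedy = [sample_token(logits, 0) for _ in range(5)]

# T=1: mostly token 0, but the others show up too.
rng = random.Random(0)
sampled = [sample_token(logits, 1.0, rng) for _ in range(1000)]
print(greedy, {i: sampled.count(i) for i in range(len(logits))})
```

Lower temperatures concentrate the samples on the top token, and higher ones spread them out, which is the consistency-for-variety trade described above.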

efskap 4 days ago | parent | prev [-]

>or get stuck in a loop

You are absolutely right! Greedy decoding does exactly that for longer seqs: https://huggingface.co/docs/transformers/generation_strategi...

Interestingly DeepSeek recommends a temperature of 0 for math/coding, effectively greedy.

flutas 4 days ago | parent | prev [-]

That 30% is of ALL users, not users who made a request. It's important to note the weasel wording there.

How many users forget they have a sub? How many get a sub through work and don't use it often?

I'd bet a large number tbh based on other subscription services.

smca 4 days ago | parent | next [-]

(I work at Anthropic) It's 30% of all CC users that made a request during that period. We've updated the post to be clearer.

flutas 4 days ago | parent [-]

Thanks for the correction and updating the post.

I typically read corporate posts as cynically as possible, since it's so common to word things in any way to make the company look better.

Glad to see an outlier!

extr 4 days ago | parent | prev [-]

That's a pretty cynical read. My personal impression is that Anthropic has a high level of integrity as an organization. Believe what you want, I'm inclined to give them the benefit of the doubt here and move on.

kashunstva 4 days ago | parent [-]

> My personal impression is that Anthropic has a high level of integrity as an organization.

Unless you consider service responsiveness as a factor of integrity. Still waiting on a service message reply from the third week of May. I’m sure it’s right around the corner though.