▲ | data-ottawa 4 days ago
With all due respect to the Anthropic team, I think the Claude status page [1] warrants an internal code red for quality. There were 50 incidents in July, 40 in August, and 21 so far in September. I have worked in places where we started approaching half those numbers, and it always resulted in a hard pivot to focusing on uptime and quality.

Despite this I'm still a paying customer, because Claude is a fantastic product and I get a lot of value from it. After trying the API, buying a 20x Max membership was a no-brainer. The amount of stuff I've gotten done with Claude has been awesome. But the last several weeks have strongly made me question my subscription.

I appreciate the openness of this post, but as a customer I'm not happy, and I don't trust that these issues are all discovered and resolved yet, especially the load-balancing ones. Anecdotally, I notice that around 12 ET (9 AM Pacific) my Claude Code sessions noticeably drop in quality.

Again, I hope the team is able to keep finding and fixing these issues. Even running local models on my own machine at home, I run into complicated bugs all the time. I won't pretend these are easy problems; they are difficult to find and fix.

[1] https://status.anthropic.com/
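For what it's worth, an impression like this can be checked against data. A minimal probe, run from cron, could log per-request latency and output token counts to a CSV so time-of-day patterns become visible. This is only a sketch: it assumes the public Anthropic Messages API, and the model id and prompt are placeholders, not recommendations.

    # Probe the Anthropic Messages API once and append timing stats to a CSV.
    import csv, os, time
    import requests

    API_URL = "https://api.anthropic.com/v1/messages"
    HEADERS = {
        "x-api-key": os.environ["ANTHROPIC_API_KEY"],
        "anthropic-version": "2023-06-01",
        "content-type": "application/json",
    }
    BODY = {
        "model": "claude-sonnet-4-20250514",  # placeholder model id
        "max_tokens": 256,
        "messages": [{"role": "user", "content": "Summarize quicksort in three sentences."}],
    }

    start = time.time()
    resp = requests.post(API_URL, headers=HEADERS, json=BODY, timeout=120)
    elapsed = time.time() - start

    # The Messages API reports token counts in a top-level "usage" object.
    usage = resp.json().get("usage", {}) if resp.ok else {}
    with open("claude_probe.csv", "a", newline="") as f:
        csv.writer(f).writerow([
            time.strftime("%Y-%m-%dT%H:%M:%S"),
            resp.status_code,
            round(elapsed, 2),
            usage.get("output_tokens", ""),
        ])

A few weeks of rows like these, grouped by hour, would show whether the 12 ET dip is real latency degradation or just perception.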
▲ | ruszki 4 days ago
I don't know whether they are better or worse than others. One thing is for sure: a lot of companies lie on their status pages. I frequently encounter outages that are never reported, so nowadays I'm more surprised when a company self-reports a problem.

Personally, I haven't had serious problems with Claude so far, but it's possible I've just been lucky. From my perspective, it may simply be that they report outages more faithfully than others. But that could be entirely coincidental.
▲ | martinald 4 days ago
What makes it even worse is that the status page doesn't capture the smaller incidents. This is the same for all providers. If they actually published real-time graphs of token latency, failed requests, tokens/s, etc., I think the picture would be pretty horrific. If you trust OpenRouter's data, the uptime record of these APIs is, to say the least, not good: https://openrouter.ai/openai/gpt-5/uptime

It's clear to me that every provider is having enormous scaling challenges. Claude Code often slows to a crawl and I have to interrupt it and tell it to try again. This is especially pronounced around 4-6pm UK time, when Europe, the Eastern US, and the West Coast US are all hammering it at once. Even today I was getting 503 "model overloaded" errors from Gemini AI Studio at that time, with nothing on the status page.

I really wonder whether it would be worth Claude et al. offering a cheaper off-peak plan to level out demand. Perhaps the optics of that don't look good, though.

Edit to add: I think another potential dimension to this is that GB200s have been a lot slower to come online than the industry probably expected. There have been a lot of defects in various hardware and software components, and I suspect the liquid cooling has been difficult to get right (with far more catastrophic failure states!).
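One client-side mitigation for the "interrupt it and tell it to try again" dance is retrying overload responses with exponential backoff and jitter instead of hammering the endpoint. A hedged sketch follows; Google returns 503 for overload and Anthropic documents 529, but the exact status set and constants here are illustrative, not canonical:

    # Retry a POST on overload/rate-limit status codes with capped
    # exponential backoff plus jitter.
    import random, time
    import requests

    RETRYABLE = {429, 500, 503, 529}  # rate-limited, transient, overloaded

    def post_with_backoff(url, *, headers, json, max_tries=5):
        delay = 1.0
        for _ in range(max_tries):
            resp = requests.post(url, headers=headers, json=json, timeout=120)
            if resp.status_code not in RETRYABLE:
                return resp
            # Honor Retry-After if the server sends one, else back off.
            sleep_for = float(resp.headers.get("retry-after", delay))
            time.sleep(sleep_for + random.uniform(0, 0.5))
            delay = min(delay * 2, 30.0)
        return resp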
▲ | willsmith72 4 days ago
> Despite this I'm still a paying customer because Claude is a fantastic product and I get a lot of value from it.

Doesn't that say it all? At this point, the quality of the AI trumps reliability for the customer (you and me). So even though they should focus on reliability (and I'm sure they will), why would they prioritise it over model quality right now?
▲ | lumost 4 days ago
I've become extremely nervous about these sudden declines in quality. Thankfully I don't have a production product using AI (yet), but in my own development experience, a model suddenly becoming dramatically dumber is very difficult to work around.

At this point, I'd be surprised if the various vendors on OpenRouter weren't abusing users' trust by silently dropping context, changing quantization levels, reducing active experts, or other mischievous means of serving the same model at lower compute.
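The context-dropping part at least is cheap to test with a "needle" probe: bury a random token early in a long prompt and ask the model to echo it back; if the provider silently truncates, retrieval fails. A rough sketch against OpenRouter's OpenAI-style chat completions endpoint; the model id and filler size are placeholders:

    # Needle-in-haystack probe for silent context truncation.
    import os, secrets
    import requests

    needle = secrets.token_hex(8)
    filler = "lorem ipsum " * 8000  # pad well into the advertised context window
    prompt = (
        f"Remember this code: {needle}\n\n{filler}\n\n"
        "What was the code at the start of this message? Reply with the code only."
    )

    resp = requests.post(
        "https://openrouter.ai/api/v1/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
        json={
            "model": "anthropic/claude-sonnet-4",  # placeholder model id
            "messages": [{"role": "user", "content": prompt}],
        },
        timeout=300,
    )
    reply = resp.json()["choices"][0]["message"]["content"]
    print("needle recovered" if needle in reply else f"needle lost: {reply[:100]}")

This only catches truncation, of course; detecting quantization or expert-count changes would need actual output-quality benchmarks.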
▲ | renewiltord 4 days ago
This is why you should always put as few incidents on your status page as possible. People's opinion of you will drop, and then the negative effect fades over time; but if you have a status page, it's incontrovertible proof. Better to lie. They'll forget.

S3, for example, has encountered elevated error rates many times without reporting them, and no one says anything about S3. People will say many things, but their behaviour rewards the lie. Every growth-hack startup guy knows this already.