martinald | 4 days ago
What makes it even worse is that the status page doesn't capture all the smaller incidents. This is the same for all providers: if they actually provided real-time graphs of token latency, failed requests, tokens/s, etc., I think they'd be pretty horrific. If you trust this OpenRouter data, the uptime record of these APIs is... not good, to say the least: https://openrouter.ai/openai/gpt-5/uptime

It's clear to me that every provider is having enormous scaling challenges. Claude Code often slows to a crawl and I have to interrupt it and tell it to try again. This is especially pronounced around 4-6pm UK time (when Europe, the Eastern US, and the West Coast US are all hammering it). Even today I was getting 503 errors from Gemini AI Studio with "model overloaded" at that time, with nothing on the status page.

I really wonder if it would be worth Claude et al. offering a cheaper off-peak plan to try to level out demand. Perhaps the optics of that don't look good, though.

Edit to add: I think another potential dimension to this is that GB200s have been a lot slower to come on stream than the industry probably expected. There have been a lot of defects in various hardware and software components, and I suspect the liquid cooling has been difficult to get right (with far more catastrophic failure states!).
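For what it's worth, the usual client-side mitigation for those transient 503/"overloaded" responses is to retry with exponential backoff and jitter rather than failing (or hammering) immediately. A minimal sketch, with `TransientError` standing in for an overloaded response (names here are illustrative, not any provider's SDK):

```python
import random
import time

class TransientError(Exception):
    """Stand-in for a 503 / 'model overloaded' response."""

def with_backoff(call, max_retries=5, base=1.0, cap=30.0):
    """Retry `call` on transient errors, sleeping with exponential backoff plus full jitter."""
    for attempt in range(max_retries):
        try:
            return call()
        except TransientError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            # full jitter: sleep a random amount up to min(cap, base * 2**attempt)
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```

The jitter matters at peak hours: if every client retries on the same fixed schedule, the retries themselves arrive in synchronized waves and prolong the overload.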
Maxious | 4 days ago | parent
Artificial Analysis also monitors LLM provider APIs independently, "based on 8 measurements each day at different times". You can see the degradation as Opus 4.1 came online: https://artificialanalysis.ai/providers/anthropic#end-to-end...
l1n | 4 days ago | parent
> Claude et al offering a cheaper off peak plan

We do offer Batch Processing today: https://docs.claude.com/en/docs/build-with-claude/batch-proc...
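A batch bundles many independent message requests into one asynchronous job, which is how the cheaper off-peak-style pricing works in practice. A rough sketch of the request-body shape (field names as described in the batch docs; the model id here is just a placeholder, not a real model name):

```python
prompts = ["Summarize document A", "Summarize document B"]

# Sketch of a batch submission body: a list of independent requests,
# each tagged with a custom_id so results can be matched up later.
batch_body = {
    "requests": [
        {
            "custom_id": f"req-{i}",  # caller-chosen id, echoed back in the results
            "params": {
                "model": "claude-example-model",  # placeholder model id
                "max_tokens": 256,
                "messages": [{"role": "user", "content": prompt}],
            },
        }
        for i, prompt in enumerate(prompts)
    ]
}
```

The trade-off is latency: results come back within a processing window rather than interactively, which is exactly what makes batches a natural demand-leveling tool.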