exitb an hour ago

An operator at load capacity can either refuse requests, or move the knobs (quantization, thinking time) so requests process faster. Both of those things make customers unhappy, but only one is obvious.

codeflo an hour ago | parent | next [-]

This is intentional? I think delivering lower quality than what was advertised and benchmarked is borderline fraud, but YMMV.

TedDallas an hour ago | parent | next [-]

Per Anthropic’s RCA for the September 2025 issues, linked in OP's post:

“… To state it plainly: We never reduce model quality due to demand, time of day, or server load. …”

So, according to Anthropic, they are not tweaking quality settings due to demand.

rootnod3 an hour ago | parent | next [-]

And according to Google, they always delete data if requested.

And according to Meta, they always give you ALL the data they have on you when requested.

entropicdrifter 19 minutes ago | parent | next [-]

>And according to Google, they always delete data if requested.

However, the request form is on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying ‘Beware of the Leopard’.

groundzeros2015 13 minutes ago | parent | prev [-]

What would you like?

cmrdporcupine 34 minutes ago | parent | prev | next [-]

I guess I just don't know how to square that with my actual experiences then.

I've seen sporadic drops in reasoning skills that made me feel like it was January 2025, not 2026 ... inconsistent.

root_axis 3 minutes ago | parent [-]

I wouldn't doubt that these companies would deliberately degrade performance to manage load, but it's also true that humans are notoriously terrible at identifying random distributions, even with something as simple as a coin flip. It's very possible that what you view as degradation is just "bad RNG".
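
To make the coin-flip point concrete, here is a minimal sketch (my own illustration, not anything from Anthropic's RCA) that estimates how often 100 fair coin flips contain a streak of 6 or more identical results. The function names, the streak threshold, and the trial count are all arbitrary assumptions chosen for the example.

```python
import random


def longest_run(flips):
    """Return the length of the longest run of identical consecutive values."""
    best = current = 0
    previous = None
    for flip in flips:
        # Extend the current run if the value repeats, otherwise start a new one.
        current = current + 1 if flip == previous else 1
        previous = flip
        best = max(best, current)
    return best


def streak_probability(n_flips=100, min_streak=6, trials=10_000, seed=0):
    """Estimate by simulation how often a fair-coin sequence contains a long streak."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(n_flips)]
        if longest_run(flips) >= min_streak:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"Share of sequences with a streak of 6+: {streak_probability():.2f}")
```

If you run it, the estimate should come out well over half, which is the sense in which a "bad streak" of responses is weak evidence of a deliberate change on its own.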

cmrdporcupine a few seconds ago | parent [-]

yep stochastic fantastic

these things are by definition hard to reason about

mcny an hour ago | parent | prev | next [-]

Personally, I'd rather get queued up with a long wait time. I mean, not ridiculously long, but I am OK waiting five minutes to get correct, or at least more correct, responses.

Sure, I'll take a cup of coffee while I wait (:

lurking_swe an hour ago | parent [-]

i’d wait any amount of time lol.

at least i would KNOW it’s overloaded and i should use a different model, try again later, or just skip AI assistance for the task altogether.

direwolf20 an hour ago | parent | prev | next [-]

They don't advertise a certain quality. You take what they have or leave it.

denysvitali an hour ago | parent | prev | next [-]

If there's no way to check, then how can you claim it's fraud? :)

chrisjj an hour ago | parent | prev | next [-]

There is no level of quality advertised, as far as I can see.

bpavuk an hour ago | parent | prev | next [-]

> I think delivering lower quality than what was advertised and benchmarked is borderline fraud

welcome to Silicon Valley, I guess. everything from Google Search to Uber is fraud. Uber is a classic example of this playbook, even.

copilot_king an hour ago | parent | prev [-]

If you aren't defrauding your customers you will be left behind in 2026

rootnod3 an hour ago | parent [-]

That number is a sliding window, isn't it?

sh3rl0ck 22 minutes ago | parent | prev [-]

I'd wager that lower tok/s vs lower quality of output would be two very different knobs to turn.