dajonker 2 hours ago

Wouldn't be surprised if they slowly start quantizing their models over time. Makes it easier to scale and reduce operational cost. Also makes a new release have more impact as it will be more notably "better" than what you've been using the past couple of days/weeks.

Roark66 16 minutes ago | parent | next [-]

I haven't noticed much difference in Claude, but I swear Gemini 3 Pro preview was better in the first week or two and later started feeling like they had quantized it down to hell.

kilroy123 an hour ago | parent | prev | next [-]

It sure feels like they do this. They claim they don't, but when you use it every day for 5-10 hours a day, you notice when something changes.

This last week it seems way dumber than before.

eli 38 minutes ago | parent | prev | next [-]

I would be surprised tbh.

Anthropic does not exactly act like they're constrained by infra costs in other areas, and noticeably degrading a product when you're in tight competition with 1 or 2 other players with similar products seems like a bad place to start.

I think people just notice the flaws in these models more the longer they use them. Aka the "honeymoon-hangover effect," a real pattern that has been shown in a variety of real world situations.

rustyhancock 2 hours ago | parent | prev | next [-]

Oooff yes I think that is exactly the kind of shenanigans they might pull.

Ultimately, I can understand that if a new model comes in without as much optimization, it will add pressure on the older models, achieving the same result.

Nice plausible deniability for a convenient double effect.

YetAnotherNick 2 hours ago | parent | prev [-]

Benchmarks like ARC-AGI are strongly price-correlated and cheap to run. I think it would be very easy to prove whether the models are degrading.
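
A minimal sketch of what such a check could look like, assuming you run the same fixed set of benchmark tasks against the same model name on two different dates and count how many it gets right each time. The function name and the numbers in the usage lines are hypothetical, not real results. It is a plain two-proportion z-test, so it only tells you whether the gap in pass rates is larger than run-to-run noise would explain, not why.

```python
import math


def two_proportion_z_test(correct_a, total_a, correct_b, total_b):
    """Return (z, two-sided p-value) for H0: both runs share one pass rate.

    Run A is the earlier snapshot, run B the later one. A positive z
    means the earlier run scored higher.
    """
    p_a = correct_a / total_a
    p_b = correct_b / total_b
    # Pooled pass rate under the null hypothesis of "no change".
    pooled = (correct_a + correct_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        # Both runs passed everything or failed everything: no evidence of change.
        return 0.0, 1.0
    z = (p_a - p_b) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 80/100 tasks solved at launch, 60/100 a month later.
z, p = two_proportion_z_test(80, 100, 60, 100)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With only a hundred tasks per run, a drop of a few points would not be distinguishable from sampling noise, so the task set and sampling settings need to stay fixed and the runs need to be large enough for the difference you care about.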