| ▲ | dajonker 2 hours ago | |
Wouldn't be surprised if they slowly start quantizing their models over time. Makes it easier to scale and reduce operational cost. Also makes a new release have more impact as it will be more notably "better" than what you've been using the past couple of days/weeks. | ||
| ▲ | Roark66 16 minutes ago | parent | next [-] | |
I haven't noticed much difference in Claude, but I swear Gemini 3 Pro Preview was better in the first week or two and later started feeling like they quantized it down to hell. | ||
| ▲ | kilroy123 an hour ago | parent | prev | next [-] | |
It sure feels like they do this. They claim they don't, but when you use it every day for 5-10 hours a day, you notice when something changes. This last week it seems way dumber than before. | ||
| ▲ | eli 38 minutes ago | parent | prev | next [-] | |
I would be surprised tbh. Anthropic does not exactly act like they're constrained by infra costs in other areas, and noticeably degrading a product when you're in tight competition with 1 or 2 other players with similar products seems like a bad place to start. I think people just notice the flaws in these models more the longer they use them, aka the "honeymoon-hangover effect," a real pattern that has been shown in a variety of real-world situations. | ||
| ▲ | rustyhancock 2 hours ago | parent | prev | next [-] | |
Oof, yes, I think that is exactly the kind of shenanigans they might pull. Ultimately I can understand it: if a new model is coming in without as much optimization, then it'll put pressure on the older models to achieve the same result. Nice plausible deniability for a convenient double effect. | ||
| ▲ | YetAnotherNick 2 hours ago | parent | prev [-] | |
Benchmarks like ARC-AGI are super price-correlated and cheap to run. I think it's very easy to prove that the models are degrading. | ||
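For what it's worth, here is a rough sketch of how such a check could look: run the same fixed task set against the same model on two dates and test whether the pass rate dropped by more than noise would explain. This is only one possible approach (a one-sided two-proportion z-test); the function name and the numbers in the usage example are hypothetical, not real benchmark results, and it assumes the tasks and scoring are identical across both runs.

```python
import math


def degradation_z_test(pass_before, n_before, pass_after, n_after):
    """One-sided two-proportion z-test for a drop in pass rate.

    Returns (z, p_value). A small p_value suggests the later run's
    pass rate is lower than the earlier one by more than chance.
    """
    if n_before <= 0 or n_after <= 0:
        raise ValueError("both runs need at least one task")

    rate_before = pass_before / n_before
    rate_after = pass_after / n_after

    # Pooled pass rate under the null hypothesis "nothing changed".
    pooled = (pass_before + pass_after) / (n_before + n_after)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_before + 1 / n_after))

    if se == 0:
        # Every task passed (or failed) in both runs: no evidence either way.
        return 0.0, 1.0

    z = (rate_before - rate_after) / se
    # P(Z > z) for a standard normal, via the complementary error function.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    z, p = degradation_z_test(pass_before=80, n_before=100,
                              pass_after=60, n_after=100)
    print(f"z = {z:.2f}, one-sided p = {p:.4f}")
```

A single run pair like this says nothing about why a score moved (quantization, a changed system prompt, sampling noise in the harness), so you'd want repeated runs before claiming anything.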