farkin88 3 days ago
Even though tokens are getting cheaper, I think the real killer of "unlimited" LLM plans isn't token costs themselves, it's the shape of the usage curve that's unsustainable. These products see a Zipf-like distribution: thousands of casual users nibble a few hundred tokens a day while a tiny group of power automations devour tens of millions. Flat pricing works fine until one of those whales drops a repo-wide refactor or a 100 MB PDF into chat and instantly torpedoes the margin. Unless vendors turn those extreme loops into cheaper, purpose-built primitives (search, static analyzers, local quantized models, etc.), every "all-you-can-eat" AI subscription is just a slow-motion implosion waiting for its next whale.
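To make the whale math concrete, here's a toy simulation of that Zipf-like curve. All numbers (user count, top usage, per-token cost, flat fee) are made-up assumptions for illustration, not real vendor data:

```python
# Toy model: user at rank k uses TOP_USAGE / k tokens per month,
# a Zipf-like curve where the head dominates total consumption.
# All constants are illustrative assumptions, not real figures.

N_USERS = 10_000
TOP_USAGE = 50_000_000   # assumed: heaviest user's monthly tokens
PRICE_PER_M = 2.00       # assumed: vendor cost per million tokens ($)
FLAT_FEE = 20.00         # assumed: flat subscription price ($)

usage = [TOP_USAGE / rank for rank in range(1, N_USERS + 1)]
total_tokens = sum(usage)
top_1pct = sum(usage[: N_USERS // 100])

whale_cost = usage[0] / 1e6 * PRICE_PER_M
revenue = N_USERS * FLAT_FEE

print(f"top 1% of users consume {top_1pct / total_tokens:.0%} of all tokens")
print(f"heaviest user costs ${whale_cost:.0f}/mo against a ${FLAT_FEE:.0f} fee")
```

With a 1/k curve the top 1% of users end up consuming roughly half of all tokens, and the single heaviest user costs the vendor several times their flat fee, which is exactly the "slow-motion implosion" shape described above.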
andyferris 3 days ago | parent
I actually think Anthropic's plans with capped usage per 5-hour period and per week are good, for exactly this problem. I'd prefer they just specify a number of tokens rather than vary with demand - I see that varying lets them be more generous during low periods, but the opacity of it all sucks. I have 5-minute time-of-use pricing on my electricity and can look up the current rate on my phone in an instant - why not simply provide an API to look up the current "demand factor" for Claude (along with the rules for how the demand factor can change - min and max values, for example) and make it fully transparent?
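No such endpoint exists today - this is purely a sketch of what the transparent scheme proposed above might look like, with a published floor and ceiling so users can always compute their effective budget. Every name and number here is hypothetical:

```python
# Hypothetical sketch only: Anthropic publishes no "demand factor" API.
# The idea: expose the current multiplier plus its published min/max
# bounds, so the effective token budget is always computable.

from dataclasses import dataclass

@dataclass
class DemandFactor:
    current: float   # multiplier applied to token accounting right now
    minimum: float   # published floor (off-peak, most generous)
    maximum: float   # published ceiling (peak load, least generous)

def effective_budget(base_tokens: int, df: DemandFactor) -> int:
    """Tokens actually available this period under the current factor."""
    if not df.minimum <= df.current <= df.maximum:
        raise ValueError("factor outside its published rules")
    return int(base_tokens / df.current)

# e.g. a hypothetical 1M-token weekly quota at an off-peak factor of 0.8
df = DemandFactor(current=0.8, minimum=0.8, maximum=2.5)
print(effective_budget(1_000_000, df))
```

The point is the contract, not the code: because `minimum` and `maximum` are published alongside `current`, the worst case is knowable in advance, the same way time-of-use electricity tariffs publish their peak rate.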