hibikir 3 days ago

The volume of API tokens many large companies are using through, say, AWS Bedrock is quite high. We've seen leaked bills for real-world use cases. It's not unreasonable to see normal individual subscriptions as possibly subsidized... but do we think someone like Anthropic is going to be subsidizing 7-, 8-, or even 9-figure monthly bills from megacorps? Said megacorps will swap to a competitor immediately, so the subsidy is unlikely to buy loyalty or anything.

If Anthropic and OpenAI are subsidizing the metered API usage, their model is going to end up just as successful as MoviePass. They are burning enough money on the training costs already.

dakolli 3 days ago | parent [-]

Large companies are paying an arm and a leg, but I'm still certain that even at $15.00 per million tokens they are not profitable.

If you have a machine running at 150 tok/s, you can only make about $5,800 a month at $15 per 1M tokens running 24/7. It costs a hell of a lot more than $6k a month to run Claude 4.7 at 150 tok/s on that machine 24/7.

This math is a bit off, because you have input tokens too, but regardless it's still not profitable, especially given how long it takes to turn around a request, and the caching is probably not all that profitable either.
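For anyone who wants to check the napkin math, here is a minimal sketch of it. It assumes a 30-day month, counts output tokens only, and treats the machine as a single stream running flat out; the function name is just for illustration. The result lands close to the figure quoted above.

```python
# Napkin math: revenue from ONE stream of output tokens, billed per million.
# Assumptions (not from any provider): 30-day month, 100% utilization,
# output tokens only, no input-token or cache pricing.
SECONDS_PER_MONTH = 60 * 60 * 24 * 30


def monthly_revenue(tokens_per_second: float, usd_per_million_tokens: float) -> float:
    """Gross monthly revenue in USD for a constant token rate."""
    tokens_per_month = tokens_per_second * SECONDS_PER_MONTH
    return tokens_per_month / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    # 150 tok/s at $15 per 1M tokens
    print(f"${monthly_revenue(150, 15.00):,.0f} per month")
```

This is revenue, not profit, and it says nothing about what the hardware actually costs to run.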

NitpickLawyer 3 days ago | parent [-]

You are all over this thread, but you have no idea how inference works, and it's obvious. Your napkin math is off because you don't know what to add up; you lack the necessary background. And yet you persist and reply all over this thread. I don't get it.

Serving models on dedicated hardware is not the same as your at-home 150 t/s thing. Inference is measured in thousands of tokens/s in aggregate (i.e. across all the sessions running in parallel). That's how they make money.
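A rough sketch of why the aggregate number matters, using the same per-million pricing. The session count and per-session speed below are made-up illustrative values, not measurements of any real deployment, and the sketch assumes a 30-day month, full utilization, and output tokens only.

```python
# Same pricing arithmetic, but throughput is summed over parallel sessions.
# All inputs are hypothetical; real batching behavior and utilization vary.
SECONDS_PER_MONTH = 60 * 60 * 24 * 30  # assumes a 30-day month


def aggregate_monthly_revenue(
    concurrent_sessions: int,
    tokens_per_second_per_session: float,
    usd_per_million_tokens: float,
) -> float:
    """Gross monthly revenue in USD when many sessions are served at once."""
    aggregate_tps = concurrent_sessions * tokens_per_second_per_session
    tokens_per_month = aggregate_tps * SECONDS_PER_MONTH
    return tokens_per_month / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    # Hypothetical: 60 parallel sessions at 50 tok/s each = 3,000 tok/s aggregate
    print(f"${aggregate_monthly_revenue(60, 50, 15.00):,.0f} per month")
```

Whether that covers the cost of the hardware is a separate question; the point is only that the revenue side scales with aggregate throughput, not with what one session sees.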