Expect to see more of these kinds of announcements as companies need to start showing returns on their AI investments. It's hard to say how subsidized the current AI products are[1] but we're definitely getting a free lunch at VCs' expense at the moment.

[1] Ed Zitron speculates the actual prices with token-based billing for heavy users will be something like 10x the subscription price, but this seems high.

Leynos 2 days ago | parent | next [-]

Not that I give much credence to anything Zitron says, but the amount of inference you can get on a £200 a month OpenAI or Anthropic subscription is easily an order of magnitude more than what you'd get paying the same amount at subscription rate.

Although I would also point out that OpenAI recently tripled the amount of Codex inference you get per month for £200 (and to head off the suggestion, this is distinct from their current 2x promotion on £100/month plans)

PunchyHamster 2 days ago | parent | next [-]

> Not that I give much credence to anything Zitron says, but the amount of inference you can get on a £200 a month OpenAI or Anthropic subscription is easily an order of magnitude more than what you'd get paying the same amount at subscription rate.

Neither of those is how much it actually costs the company selling the service. And I have a feeling they are running at a loss here, so the play is "get everything possible using LLMs, then jack up the pricing".

semiquaver 2 days ago | parent [-]

There have been plenty of studies which indicate that inference considered by itself is almost certainly quite profitable at all the frontier labs. The problem is amortizing the cost of all the expensive training runs required to train new models into the revenue stream.

pkaye 2 days ago | parent [-]

Does that mean those running the open models are highly profitable since they don't have to do any training?

	semiquaver 2 days ago | parent | next [-]
		I don’t know about "highly", since they have even less of a moat than Anthropic and OpenAI do. Anyone with a few hundred thousand dollars or sufficient free GPUs can compete with them. So running an open model should earn a market-rate margin.
	polski-g 2 days ago | parent | prev [-]
		Yes obviously, otherwise they wouldn't be doing it; they'd just go back to mining shitcoins.

o10449366 2 days ago | parent | prev | next [-]

Yeah, I'm sure the numbers are a bit inflated compared to API, but with my Claude $200/month subscription I've supposedly consumed 12,160,410,828 tokens in April for a cost of $22,733.03.

Leynos 2 days ago | parent [-]

Is that taking cache hits into account?

	o10449366 2 days ago | parent [-]
		Cache create is 202,746,985 and cache read is 11,998,411,722, from claude-code-monitor.

paulddraper 2 days ago | parent | prev [-]

*more than what you'd get paying the same amount at usage rate.

	Leynos 2 days ago | parent [-]
		Yes, thanks. Too late to edit now, sadly.

pier25 2 days ago | parent | prev [-]

> 10x the subscription price, but this seems high

Inference is cheap but training is quite expensive. Plus all the money they've invested and keep investing in hardware, data centers, etc. And evidently they also need to make a profit at some point.

xienze 2 days ago | parent [-]

> Inference is cheap

Maybe from the perspective of traditional, turn-based chat. But when you start having developers command an army of agents that work around the clock, those cheap tokens start adding up fast...

mbb70 2 days ago | parent [-]

If the unit-economics work out and they can sell $0.99 of tokens for $1.00, doesn't matter how many agents you spin up. The flat rate subscriptions can't last though.

xienze 2 days ago | parent [-]

> If the unit-economics work out and they can sell $0.99 of tokens for $1.00

I think the margins have to be a lot higher than that in order to give investors the return they're expecting, to continue the never-ending training treadmill, and to build more and more datacenters to accommodate people basically DDOS'ing the GPUs in order to run their workloads.

Yes, in theory what you said makes sense. But the tightrope these companies have to walk is that the per-token costs still have to be low enough that developers and companies don't just say "ehhh I guess we can still do all this work the old-fashioned way" but ALSO high enough to cover the massive expenses AND astronomical returns everyone's expecting.

maccard 2 days ago | parent [-]

VC investment isn’t about margins, it’s about finding a unicorn. It doesn’t matter if margins are negative if your product is dominant in the market as you can fiddle with the margins after the fact. You just need to be invested long enough to see everyone else fail.

	AlexandrB 2 days ago | parent | next [-]
		The problem with AI is that there doesn't seem to be a durable barrier to entry for a "winner take all" dynamic to work. The biggest barrier to entry seems to be the capital needed to train the models, but even free models are getting "good enough" for some uses and there's little friction to stop users from switching between models. Many frontends make this explicit by letting you pick the model you want to run inside the same environment. If prices go up, I suspect a bunch of folks will jump to cheaper, less capable models instead of eating the added cost. The whole value proposition of AI in enterprise is around cost-cutting, so that mentality is likely to persist when choosing which model to pay for.
	xienze 2 days ago | parent | prev [-]
		I imagine the calculus changes a little bit when you've invested hundreds of billions (trillions?) of dollars in a relatively short period of time. Priority number one is probably getting that money back. I think the fact that providers are RAPIDLY cutting back/jacking up prices points to this being the case.