libraryofbabel 8 hours ago

There is so much confusion on this topic. Please don't spread more of it; the answers are just a quick google away. To spell it out:

1) AI companies make money on the tokens they sell through their APIs. At my company we run Claude Code by buying Claude Sonnet and Opus tokens from AWS Bedrock. AWS and Anthropic make money on those tokens. The unit economics are very good here: estimates put Anthropic's and OpenAI's gross margins on token sales at around 40%.

2) Claude Code subscriptions are probably subsidized somewhat on a per token basis, for strategic reasons (Anthropic wants to capture the market). Although even this is complicated, as the usage distribution is such that Anthropic is making money on some subscribers and then subsidizing the ultra-heavy-usage vibe coders who max out their subscriptions. If they lowered the cap, most people with subscriptions would still not max out and they could start making money, but they'd probably upset a lot of the loudest ultra-heavy-usage influencer-types.
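The subsidy argument in (2) can be sketched with toy arithmetic. Every number below (the subscription price, the per-token cost, the usage distribution) is made up for illustration; only the shape matters: most subscribers cost less to serve than they pay, while a small tail of capped-out heavy users drags the whole pool into the red.

```python
# Toy model of flat-rate subscription economics. All numbers are
# hypothetical; none come from Anthropic.

PRICE = 100.0            # hypothetical monthly subscription price, USD
COST_PER_M_TOKENS = 5.0  # hypothetical inference cost per million tokens

# Hypothetical monthly usage in millions of tokens: many light users,
# plus a couple of ultra-heavy users who max out the cap.
usage = [2, 3, 5, 8, 10, 15, 400, 500]

def margin(tokens_m: float) -> float:
    """Profit (or loss) on one subscriber for the month."""
    return PRICE - tokens_m * COST_PER_M_TOKENS

margins = [margin(u) for u in usage]
profitable = sum(m > 0 for m in margins)
print(f"{profitable}/{len(usage)} subscribers are individually profitable")
print(f"net across the pool: {sum(margins):.0f} USD")
```

With these made-up numbers, six of eight subscribers are individually profitable, yet the two heavy users push the pool's net negative; lowering the usage cap trims only the tail, which is exactly the trade-off described above.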

3) The biggest cost AI companies have is training new models. That is the reason AI companies are not net profitable. But that's a completely separate question from what inference costs, which is what matters here.

somewhereoutth 5 hours ago | parent [-]

Without training new models, existing models will become more and more out of date until they are no longer useful, regardless of how cheap inference is. Training new models is part of the cost basis and can't be hand-waved away.

SgtBastard 34 minutes ago | parent [-]

Only if you're relying on the model to recall facts from its training set. Intuitively, at sufficient complexity, a model's ability to reason is what's critical, and its answers can be kept up to date with RAG.
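The RAG point can be shown with a minimal sketch: facts newer than the training cutoff live in an external corpus and get retrieved into the prompt, so the frozen model only has to reason over them. The keyword-overlap retriever, the toy corpus, and the prompt format here are all hypothetical stand-ins, not any vendor's actual pipeline (real systems use embedding search, not word overlap).

```python
# Minimal RAG sketch: retrieval supplies post-training facts; the model
# (not shown) would reason over the assembled prompt. Toy retriever only.

def score(query: str, doc: str) -> int:
    """Count query words that also appear in the document (toy relevance)."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k highest-overlap documents."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query: str, corpus: list[str]) -> str:
    """Prepend retrieved passages so the model answers from fresh context."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}"

# Hypothetical corpus: the first entry postdates any training cutoff.
corpus = [
    "The 2025 release renamed the flag --fast to --turbo.",
    "Bananas are yellow.",
    "The old --fast flag enabled speculative decoding.",
]
print(build_prompt("what replaced the --fast flag", corpus))
```

The retrieved context, not the weights, carries the up-to-date fact; that's the sense in which a model with frozen knowledge can still give current answers.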

Unless you mean out of date == no longer SOTA reasoning models?