| ▲ | joshjob42 2 days ago |
| There are a few major problems with the article. The most obvious is that frontier labs are not charging remotely close to the cost of tokens; afaik most estimates are north of 80% profit margins. As a reference, providers are profitably serving Kimi K2.6 for $4/1Mtok out. Is that as good as Opus? No, but it's probably at least Sonnet level, so that's ~4x cheaper than Sonnet while still being profitable to serve on the margin. So you aren't plausibly getting into actual subsidization territory until you're over 5:1 subscription-to-nameplate token costs. How many tokens can you realistically burn through in one chat session? Opus and many other frontier models do maybe 60tok/s, so a bit under 250k/hr out. Input can be much larger, but in most cases cached input is 5-10x cheaper than new input. Say you average 500ktok in, 90% cached, per request. That amounts to 100-150ktok in new-input-equivalent costs, which in most cases is ~20-30ktok in output-equivalent costs. Do a request every minute and that's a total of about 1.5-2Mtok/hr. At API prices that's $50/hr for Opus, but it probably only costs Anthropic $10/hr to serve. That said, even if a developer is burning $50/hr, many, many employees at large companies cost more than $100k/yr to employ all costs considered, so making them say 20-30% more productive can easily make that worth it. If the labs ultimately shave their margins to more like 20-30%, you'd have ~$15/hr in costs to use the services, which is ~$30k/yr, and nearly every white collar job costs way more than that to fill. If your salary is 80k, you probably cost the company 200k all in, so making you 15% more productive offsets the $15/hr cost. So first-party providers are not in a horrifying position or anything from a subsidization standpoint. The people in bad shape are Cursor and Perplexity, who don't have frontier models and are dependent on the open source community, which is typically 6-12 months behind the frontier. 
They have to pay full-freight API costs at 80% margin to the big boys to serve their harnesses, which is indeed untenable, and they'll have to either force users onto open source and/or in-house models they can serve at cost, or charge vastly more. Gemini, Claude, and ChatGPT first-party services like Antigravity, Codex, and Claude Code are not in serious trouble, though. |
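The per-hour arithmetic above can be sketched in a few lines of Python. All the inputs are the commenter's own assumptions (cache discount, input/output price ratio, generated-token count, and the ~$25/Mtok output price are illustrative, not published pricing):

```python
# Sketch of the per-hour cost estimate; all figures are assumed, not official.
CACHE_DISCOUNT = 7.5    # cached input ~5-10x cheaper than new input (midpoint)
INPUT_TO_OUTPUT = 5     # input tokens ~5x cheaper than output tokens

def output_equivalent(new_in: int, cached_in: int, out: int) -> float:
    """Convert one request's token mix into output-equivalent tokens."""
    input_equiv = new_in + cached_in / CACHE_DISCOUNT
    return input_equiv / INPUT_TO_OUTPUT + out

# One request: 500k tokens in, 90% cached, plus ~4k generated (assumed)
per_request = output_equivalent(new_in=50_000, cached_in=450_000, out=4_000)

tokens_per_hour = per_request * 60     # one request per minute
price_per_mtok = 25.0                  # assumed $/1M output tokens
cost_per_hour = tokens_per_hour / 1e6 * price_per_mtok
print(f"{tokens_per_hour/1e6:.2f} Mtok/hr, ~${cost_per_hour:.0f}/hr")
```

With these assumptions it lands inside the comment's 1.5-2 Mtok/hr range, at a list-price cost in the tens of dollars per hour.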
|
| ▲ | zozbot234 2 days ago | parent | next [-] |
| It's not even a fixed cost per token (even though it's billed that way, and that's still miles better than a fixed-price all you can eat). You're incurring a cost that's proportional to generated tokens times the context for each (plus the prefill cost for any uncached input), so the expense grows quadratically with your average generated context. This all becomes extremely visible when trying to do agentic coding with local language models - you quickly realize that controlling context length and model size is just as important as avoiding wasted effort. The real scam is not AI Q&A ala ChatGPT, that's actually quite viable - though marginally less so as conversations grow longer. It's agentic coding with SOTA models and huge contexts. |
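A toy model of the quadratic-growth point above: if each generated token attends over everything produced so far (plus the prompt), total cost grows roughly quadratically with the final context size. Units here are arbitrary and the model ignores caching and attention optimizations:

```python
# Toy cost model: per-token generation cost scales with current context length.
def generation_cost(prompt_len: int, generated: int) -> int:
    # each new token attends over the prompt plus all tokens generated so far
    return sum(prompt_len + i for i in range(generated))

short = generation_cost(prompt_len=1_000, generated=1_000)
long = generation_cost(prompt_len=1_000, generated=10_000)

# 10x more generated tokens costs roughly 40x more in this toy model
print(long / short)
```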
| |
| ▲ | GaggiX 2 days ago | parent [-] | | Using larger contexts often costs more in the APIs or consumes more of your quota, but this is becoming less of a problem with models using cleverer attention mechanisms rather than full attention on all layers. You can look at https://sebastianraschka.com/llm-architecture-gallery/ and see how much things have changed. | | |
| ▲ | margalabargala 2 days ago | parent [-] | | This is also something of a non issue because as context grows and attention gets diluted, the models perform worse. It'll cost Anthropic more to run your 900k context session, yes, but it's in your interest not to have a 900k session in the first place. |
|
|
|
| ▲ | boelboel 2 days ago | parent | prev | next [-] |
| Isn't this akin to saying Big Pharma companies could easily make money if they just stopped doing expensive research? The massive R&D spend is the core of the business plan; it's the only reason they can demand high prices in the first place. Once OpenAI stops spending billions on training, their pricing power vanishes because users will just migrate to Anthropic or whoever releases the next frontier model. That would imply there's space for only one firm to outlast the rest in some sort of war of attrition (perhaps similar to the silicon industry). |
| |
| ▲ | kimetime 2 days ago | parent [-] | | Big Pharma does seem like a good comparison for the frontier lab business model, though AI doesn't really have the patent protection or distinct diseases pharma does. Wonder if labs will start more heavily branding “specialties” instead of general capabilities to develop some differentiation. |
|
|
| ▲ | tobbe2064 2 days ago | parent | prev | next [-] |
| Your math is pretty bad
Going by Swedish standards, $50/h is a yearly cost of
50$/h × 40h/week × 48 weeks/year = 96k$/year
At that rate it's a really shitty bargain for a 30% increase in productivity. Even if you drop it to $20/h and sort of break even, you are losing competence building and theory building, decreasing the likelihood of making architectural progress and risking getting bogged down in a swamp. |
| |
| ▲ | joshjob42 16 hours ago | parent [-] | | An employee often costs a company 2-3x their salary, so someone making 100k a year, costing 300k/yr, who is made 33% more productive (100k more worth of work to the company) offsets the compute cost. |
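The break-even claim above works out as follows, using the parent's assumed numbers (2-3x loaded-cost multiplier at the upper end, a 40h/48-week working year, and the $50/hr token burn from earlier in the thread):

```python
# Sketch of the break-even math; all inputs are the thread's assumptions.
salary = 100_000
fully_loaded = 3 * salary        # assumed 2-3x multiplier, upper end
extra_value = fully_loaded / 3   # "33% more productive" ~= one-third more output

hours_per_year = 40 * 48         # assumed working schedule
tool_cost = 50 * hours_per_year  # $50/hr of token burn, every working hour

print(extra_value, tool_cost, extra_value >= tool_cost)
```

With these inputs the extra output ($100k/yr) just barely clears the tool cost ($96k/yr), which is why the claim is so sensitive to the loaded-cost multiplier and the productivity figure.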
|
|
| ▲ | loeg 2 days ago | parent | prev | next [-] |
| > How many tokens can you realistically burn through in one chat session? I've used single digit billions in a couple days, FWIW. |
| |
| ▲ | kcartlidge 2 days ago | parent | next [-] | | I'm a fair bit lower than some others as I only use it outside of work hours on my own small projects, but my Cursor account shows (for a random recent date) 12,184,233 tokens in a day. That day feels pretty representative. That's with 86 interactions spread intermittently over a couple of hours so if I did a full working day like that I'd be looking at maybe 40 to 50 million. | | |
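The extrapolation above is straightforward: ~12.2M tokens over roughly two hours of intermittent use, scaled to a full working day (the two-hour figure is an assumption read off "a couple of hours"):

```python
# Scaling the observed Cursor usage to a full working day; the 2-hour
# figure is an assumption from the comment, not an exact measurement.
tokens_observed = 12_184_233
hours_observed = 2
full_day_hours = 8

projected = tokens_observed / hours_observed * full_day_hours
print(f"~{projected / 1e6:.0f}M tokens/day")
```

That lands at the upper end of the commenter's 40-50 million estimate.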
| ▲ | loeg a day ago | parent [-] | | My employer is paying for it, so I'm cost insensitive, and this is mostly with Claude / Opus 4.7 (which consumes a lot of tokens?). |
| |
| ▲ | bwestergard 2 days ago | parent | prev [-] | | What sort of work were you doing? | | |
| ▲ | loeg 2 days ago | parent | next [-] | | Converting a couple hundred kLOC C++ codebase to Rust. | | | |
| ▲ | xienze 2 days ago | parent | prev [-] | | Not the parent, but the way developers are basically trying to create entire development "teams" consisting of multiple agents that work around the clock using the latest, most expensive models (naturally) lends itself to burning insane amounts of tokens. |
|
|
|
| ▲ | lbreakjai a day ago | parent | prev | next [-] |
| Problem with this math is it always assumes some ridiculous baseline compensation (or costs, in this case) as a matter of fact. There's an entire world of developers not costing 200k to their employers. Truth of the matter in most companies large enough is if you make your devs 30% more productive, then that'd mean 30% more code going through "change management" hell for months. You're not even paying to stand still, you're just pushing even more down a bottleneck. The price most people are willing to pay to make things worse is close to zero. |
|
| ▲ | Balinares a day ago | parent | prev | next [-] |
| > providers are profitably providing Kimi K2.6 for $4/1Mtok out. Do you perchance have a source for this? Is the profitability assessment comprehensive, including hardware amortization? I've found it hard to track down actual hard numbers for the cost of inference. |
|
| ▲ | intended 2 days ago | parent | prev | next [-] |
| > afaik most estimate north of 80% profit margins This seems to be the lynchpin of your argument. It makes me wonder if I have been living under a rock, because I have never heard of frontier labs making money. AFAIK all AI firms are simply burning money to acquire customers at this stage. Is this wrong? |
| |
| ▲ | asdfasgasdgasdg 2 days ago | parent | next [-] | | >It makes me wonder if I have been living under a rock, because I have never heard of frontier labs making money. You're confusing the profit from the marginal token and overall profit (basically gross margin and operating margin). The comment you're replying to is calculating that AI labs are probably making a substantial profit per paid token. It's just that so far that profit has not been able to overcome the ongoing R&D and capex costs. | | |
| ▲ | kgwgk 2 days ago | parent [-] | | > not been able to overcome the ongoing R&D and capex costs. And the cost of not-quite-paid tokens. | | |
| ▲ | margalabargala 2 days ago | parent [-] | | Which may or may not exist, hence this thread. | | |
| ▲ | kgwgk 16 hours ago | parent [-] | | Non-paid tokens do definitely exist and they weren’t included in the remark about “substantial profit per paid token”. Underpaid/subsidized tokens also exist which don’t provide “substantial profit”. | | |
| ▲ | margalabargala 16 hours ago | parent [-] | | Are you talking about free promo tokens the company gives out, or are you implying that subscription tokens are sufficiently subsidized so as to be below cost? |
|
|
|
| |
| ▲ | pmdr 2 days ago | parent | prev | next [-] | | People tend to believe OpenAI and Anthropic can make money any time, the only thing they need to do is to stop training newer/better models. Source? Sam & Dario, of course (trust us, bro). It may (if they sell access at API price) or may not be true, but the scenario where training is stopped is simply unrealistic at this point. | |
| ▲ | dgellow 2 days ago | parent | prev [-] | | I’m not exactly sure of the details but I believe they do make _some_ money on inference. But they then have to reinvest it all into training of the next model to stay competitive. So even if inference is positive (I’m seeing inconsistent reported data if that’s the case or not), it is directly spent. I do not understand how the companies can end up in positive, unless something fundamental changes |
|
|
| ▲ | doctorpangloss 2 days ago | parent | prev | next [-] |
| lots of words. do you think per token prices will go up or down in the long term? will the price per task trend down or up? what about the price of human labor? |
| |
| ▲ | redox99 2 days ago | parent | next [-] | | He is proving that the article is based on false information. Whether prices go up or down depends on what the labs decide and what users demand.
Strong models being profitable at lower prices than what frontier labs offer is a fact. | |
| ▲ | roywiggins 2 days ago | parent | prev | next [-] | | not nearly as many words as Ed Zitron at least | |
| ▲ | GardenLetter27 2 days ago | parent | prev [-] | | The price of everything will go down. That is the beauty of the free market. | | |
| ▲ | rspeele 2 days ago | parent | next [-] | | If the price of everything would go down it wouldn't be too concerning and everybody would be on board with the "beauty" of it. What seems to actually be happening for white collar workers is that the price they can charge for their labor is dropping, but the price of their expenses (housing, food, gas) continues to rise. | |
| ▲ | Yizahi 2 days ago | parent | prev | next [-] | | In the absolutely free market price will go up a lot in the end. Because only one monopoly will exist by that time and it will jack up prices to the maximum tolerable level. And that level can be surprisingly high, because in every human activity there will be few willing to spend crazy amounts of money for practically anything they perceive valuable. | | |
| ▲ | mike_hearn a day ago | parent [-] | | This kind of argument relies on odd definitions of "truly free" that boil down to anarchism, which isn't what anyone who advocates for a free market means. | | |
| ▲ | Yizahi a day ago | parent [-] | | So what does free market mean then? | | |
| ▲ | mike_hearn a day ago | parent [-] | | To me at least, it means a market in which the basic rules of commerce are enforced but beyond that the government doesn't micromanage. For example, contracts are enforced, there's some basic truth in advertising laws, there's a trustworthy currency available, and all the other basics of civilization like "your competitor isn't allowed to murder you". It's obviously a fuzzy scale. In a free market like that it's not guaranteed that everything ends in monopoly. Actually mostly it won't. Monopolies that do occur are due to high costs of entry and are usually temporary. | | |
| ▲ | Yizahi 21 hours ago | parent [-] | | In the market you have described we will inevitably end with a monopoly in everything, simply because you didn't mention anything preventing that. To avoid monopoly a much more micromanaging government is required. At minimum we would need a specialized bureaucracy department investigating monopolies, advanced legislative and judicial systems enforcing such laws, a lot of regulation regarding common social good (e.g. you can't just undercut competitors by selling poisonous shit, and you can't just bribe law enforcement to do the same), and we would need overreaching borders/customs/tariffs to block companies from countries not concerned about selling poisonous shit to undercut foreign competitors. And the list goes on. Basically free market advocates fail to see more than a single step in the complex web of dependencies, which tries to prevent neo-feudal monopolization of everything by robber barons who are unchecked, unelected, and above most laws and taxes. I dislike unnecessary bureaucracy and excessive government control as much as anyone, I was born in the authoritarian USSR after all and I do study history. But I fear neo-feudalism even more. I certainly have zero self-delusions about being in a "ruling class" in that potential free market dystopia. | | |
| ▲ | mike_hearn 2 hours ago | parent [-] | | It's not that we can't see them - I literally named some examples. But where is the evidence for your specific claims, because there's plenty of evidence against them. Markets without much regulation are routinely very competitive. Look at the computing industry, which for most of its history had no industry-specific regulations at all beyond the illegalization of hacking - a simple extension of private property rights. And the effect by which regulation actually strengthens incumbents and reduces competition is well known. A common problem in these discussions is conflation of different goals. You talk about companies selling "poisonous shit". That's not a competition related goal so has nothing to do with anything I've been saying. It's an environmental goal. Governments often pass environmental law fully accepting that it will reduce competition and might strengthen or even create new incumbents - and they don't care! In fact most environmental law is like that because it's exactly as you say, other countries like China don't pass such laws and out-compete local firms as a consequence. But that's not a failure of the free market. It's a failure of environmental law. Or, sometimes not even a failure, just a known tradeoff. As a general rule it's hard to find markets that are controlled by monopolies over the long run without government regulation being to blame. Temporary monopolies can arise naturally and there's nothing wrong with that, but over time they usually fall by the wayside unless a law is preventing that from happening. |
|
|
|
|
| |
| ▲ | dgellow 2 days ago | parent | prev [-] | | The free market hypothesis is about resource allocation, nothing to do with price of everything going down |
|
|
|
| ▲ | ToucanLoucan 2 days ago | parent | prev [-] |
| > That said, even if a developer is burning $50/hr, many, many employees at large companies cost more than $100k/yr to employ all costs considered, so making them say 20-30% more productive can easily make that worth it for most. If the labs shave their margins ultimately to more like 20-30%, you'd have ~$15/hr in costs to use the services, and nearly every white collar job is way over 30k/yr to employ. If your salary is 80k, you probably cost the company 200k all in, so making you 15% more productive offsets the $15/hr cost. Nobody including the connected article is making the argument that this cannot be profitable ever. People are saying "there is no way this admittedly quite interesting tool is going to be able to make back all of this money" and I think they are completely right to say that. You can absolutely make money with this stuff, just not at this scale. The buildout for this shit has been certifiably crazy and a number of the involved firms are overleveraged for tens and even hundreds of billions of dollars. How in the sweet fuck are you paying that off, plus giving investors dividends, selling this at $15/hour/user??? That math does not math. A quick google says there are between 1.5 and 4.4 million developers in the US alone, let's say it's 5 million, to be generous, and each of them is subbed to this for 8 hours per day, continuously. That's 600 million per year in revenue. If you took ALL that revenue, and put it towards paying down this debt, not leaving any for employee salaries, upkeep, ongoing development, it would take DECADES to pay down what OpenAI already owes. And yes I'm sticking directly to code, because that's the only thing I've seen it be really good at. Are we really proposing that every knowledge worker on earth and every manager of such workers is going to have an autonomous agent running all the time!? To do what, make sure they don't have to read or write email? 
Even just that example brings in a fucking mess of legal, compliance, and security violations, because LLMs are not intelligent and are not capable of being properly secured. Like I'm sorry, I cannot take this industry seriously when even the most basic back-of-napkin math is saying, nay, screaming from the rooftops that they are FUCKED. |
| |
| ▲ | belval 2 days ago | parent | next [-] | | > selling this at $15/hour/user??? That math does not math. A quick google says there are between 1.5 and 4.4 million developers in the US alone, let's say it's 5 million, to be generous, and each of them is subbed to this for 8 hours per day, continuously. That's 600 million per year in revenue That math is not mathing. $15/hour/user, with 5M devs, 8hrs and 240 working days per year that is 144B in revenue. | |
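The corrected figure above checks out when the units are made explicit (the 240 working days/year is the commenter's assumption):

```python
# Redoing the back-of-envelope revenue math with explicit units.
users = 5_000_000       # generous US developer count from the parent comment
price_per_hour = 15
hours_per_day = 8
working_days = 240      # assumed working days per year

annual_revenue = users * price_per_hour * hours_per_day * working_days
print(f"${annual_revenue / 1e9:.0f}B/year")
```

The parent's $600M figure is the *daily* total (5M × $15 × 8h), not the annual one.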
| ▲ | vidarh 2 days ago | parent | prev | next [-] | | By your numbers, it'd be $120/day per developer * 5 million = $600m per day, not per year. Of course people don't work every day, but even with European-level holidays that number is off by a factor of 240 or so. | | |
| ▲ | ToucanLoucan 2 days ago | parent [-] | | Quite right, honestly not sure how I fucked that up so bad but I'll own it. Okay so all we need is every coder + 0.6 million more or so in the United States, subscribed to this for 8 hours a day, and the business model can work. That still feels incredibly optimistic given how split the community at large seems to be about how good this tech is, and it assumes all those developers also all work for firms large enough to pay for all of that. However we are still very much in back of napkin math. We haven't even gone into what it costs to provide these services, how much it's going to cost yet for all these datacenters to be built, how much electricity and water they're going to rip through, their own employees and basic overhead, and all the rest. So IMO, we've now elevated it from "hopeless" to "this could work if a whole lot of other things line up really well." | | |
| ▲ | asdfasgasdgasdg 2 days ago | parent | next [-] | | It's not just developers who are using this. My economist friends are. I bet most business analysts and general administration folks are or will be soon. Every normal person I know in my neighborhood is using AI for this thing or that. 50M people are currently subscribed to ChatGPT and it would be very surprising if this number goes down in the future. I dunno I think about the language some people are using about AI investment and it is reminiscent of the many years where people were saying Amazon was a bad buy because they never turned a profit. Admittedly AI companies are investing more than the money they've already brought in, but I would be very hesitant to predict that it's all froth given the usefulness I've gleaned from the tools. Don't get me wrong, I'm not unconcerned, but I think there are good reasons to suspect that at least some of the AI companies are making sound investments. | |
| ▲ | vidarh a day ago | parent | prev | next [-] | | My fiancee's company has no developers, yet everyone has a paid subscription to LLMs. Certainly not $15/hour, and I don't think it's likely they'll ever pay that for everyone, but I don't find it hard to picture the aggregate cost of subscriptions on a global basis far exceeding $600m/day between far more people on subscriptions cheaper than $15/hour but more expensive than today, and companies ending up paying far more than $15/hour averaged over their developers for additional use. E.g. I already run agents 24/7 just for me. I couldn't yet justify $15/hour, but the amount I'm spending is steadily increasing as I manage to squeeze returns from more and more things. Sure, it's back of napkin math, and I also think that several of the companies we see today won't survive and/or will only survive due to consolidation, but I also think the spend is going to be immense. With respect to the datacentres, I expect we'll see inference costs crash over the coming years - we're only seeing the beginning of what dedicated ASICs will do to inference, and what work to make models more efficient will do to the need for the very largest models, and while that might drive down the spend on individual subscriptions, I think it will drive up the total spend dramatically as cheaper models become capable enough to put them "everywhere". But, yeah, ultimately we're guessing. I'm happy to put my guesses on the record, though, and look forward to looking back and seeing how wrong I got it in a couple of years. |
| ▲ | 2 days ago | parent | prev [-] | | [deleted] |
|
| |
| ▲ | Maxatar 2 days ago | parent | prev | next [-] | | You wrote an entire wall of text when you could have just taken 10 seconds to review what you call the "most basic back-of-napkin math" and realized you were off by two and a half orders of magnitude. | |
| ▲ | strongpigeon 2 days ago | parent | prev [-] | | > That's 600 million per year in revenue. According to your math, that's $600 million per day | | |
| ▲ | marcosdumay 2 days ago | parent [-] | | Yes, the GP wrote the wrong unit there. That supports his conclusion that the pay-off would take decades; if it were actually per year, it would take several centuries. |
|
|