dijit 5 hours ago

Frontier AI companies are selling at a loss.

Setting aside everything else that u/bastawhiz said[0]; the obvious fact here is that Claude, OpenAI, Gemini et al. are quite literally burning through hundreds of billions of dollars and selling the result back to you for pennies on the dollar, in the hope that they get to be the only one left.

If I spend $10 growing oranges and sell them to you for $1, then of course it's more expensive for you to do the growing yourself.

I feel like I'm taking crazy pills. These models will become more expensive over time; it's functionally impossible for them not to. They just want to capture the market before they have to stop selling at a huge loss.

[0]: https://news.ycombinator.com/item?id=48168433

vanviegen 4 hours ago | parent | next [-]

That seems unlikely. There are many providers serving open models on OpenRouter, and it's hard to believe they are all throwing money away on every token they sell.

Also, there are good technical reasons for inference being much more efficient at scale.

lowbloodsugar 21 minutes ago | parent | next [-]

Sure. And there’s Lyft and Uber and plenty of others. And Grubhub and DoorDash and Uber and how many others. And I don’t even fucking remember how many electric fucking scooter companies, I’m practically falling over scooters! I’m sure they aren’t earning market share by selling at a loss either.

dijit 4 hours ago | parent | prev [-]

The providers on OpenRouter serving open models aren't "throwing money away", agreed.

But that's not the point I'm making (or it kind of is, but it's more high-level than that).

They're running spot and preemptible GPU instances (60-80% cheaper than on-demand), paying wholesale industrial electricity rates, and running at multi-tenant utilisation densities that make your MacBook look like a bonfire. Of course they're not individually loss-making on inference: they're aggregating cheap commodity compute and skimming a margin. On paper that looks like a perfectly sensible business, certainly not a loss leader, right?

But zoom out a bit; the entire stack is swimming in VC money. OpenRouter itself just raised at a $1.3B valuation backed by a16z. The Chinese models that now account for 36% of all tokens routed through the platform (DeepSeek, Qwen) are priced the way they are because Beijing-adjacent capital has decided market share matters more than margin right now.

So yes, technically no single party is "throwing money away" on each token; they're just all simultaneously subsidising different parts of the stack for strategic reasons. The floor price you're seeing isn't a stable equilibrium, it's a pile of investor money that hasn't entirely finished burning yet.

vlovich123 4 hours ago | parent [-]

> The floor price you're seeing isn't a stable equilibrium, it's a pile of investor money that hasn't entirely finished burning yet.

All that says is that it gets more expensive in the future as competitors exit the market and sustainability becomes important. That’s why Uber and Lyft were so cheap until they killed taxis. One major difference of course is that some models will remain largely good enough and the incremental cost of running will keep dropping to 0 over time since the hardware needed doesn’t get more expensive and is already purchased.

dijit 4 hours ago | parent [-]

I think we agree.

I only object to taking current prices as if they are perpetual prices.

NicuCalcea 4 hours ago | parent | prev | next [-]

The blog compares the cost of running Gemma4 31b, which on OpenRouter is offered by small no-name inference providers, not by frontier AI companies. It seems like a fair comparison to me.

pornel 2 hours ago | parent [-]

LLM generation is bottlenecked by RAM bandwidth and latency. You can get almost linear scaling by evaluating more prompts in parallel, because the GPU has nothing to do for the relative eternity it takes to read all of the weights from DRAM for every layer for every token.

On Apple Silicon you can get 4x-8x more tokens per second if you run more queries in parallel (as long as your inference server supports it, and has enough spare RAM for more KV caches).

When inference is done at datacenter scale, with generation distributed across multiple GPUs and kernels carefully tuned to specific hardware, the ratio of available compute to DRAM bandwidth gets absurd, something like 200:1. That's why everyone gives you batch inference at a steep discount.
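
To put rough numbers on it, here's a back-of-envelope sketch in Python. Every figure in it (weight size, bandwidth, FLOPs) is an illustrative assumption of mine, not a measurement of any particular GPU or model:

    # Back-of-envelope model of memory-bandwidth-bound LLM decoding.
    # All numbers are illustrative assumptions, not measurements.

    WEIGHT_BYTES = 30e9 * 2     # ~30B-parameter model at 2 bytes per weight
    MEM_BANDWIDTH = 1.0e12      # 1 TB/s of DRAM/HBM bandwidth (hypothetical)
    COMPUTE_FLOPS = 200e12      # 200 TFLOP/s of usable compute (hypothetical)
    FLOPS_PER_TOKEN = 2 * 30e9  # ~2 FLOPs per parameter per decoded token

    def tokens_per_second(batch_size: int) -> float:
        """Decode throughput when each step must stream all weights from DRAM.

        One step reads the full weight set once regardless of batch size
        (KV-cache traffic ignored for simplicity) and emits one token per
        request; the step takes whichever is longer, the read or the matmul.
        """
        read_time = WEIGHT_BYTES / MEM_BANDWIDTH
        compute_time = batch_size * FLOPS_PER_TOKEN / COMPUTE_FLOPS
        return batch_size / max(read_time, compute_time)

    for b in (1, 4, 16, 64, 256):
        print(f"batch {b:4d}: ~{tokens_per_second(b):7.0f} tokens/s")

With these made-up numbers the decode step stays read-bound until a batch of roughly 200, which is where a 200:1 compute-to-bandwidth ratio puts the crossover; below that, extra parallel requests are nearly free.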

brianwawok 4 hours ago | parent | prev | next [-]

So many more efficiencies possible at scale though. I cannot keep a local model 98% utilized 24/7, at least not with my current workload. A big cloud can. I can’t power my servers with DC, I have this AC-to-DC conversion nonsense. The list goes on.

visarga 4 hours ago | parent [-]

Besides fill factor being hard to match, there is also scaling - you can't scale local inference 10x for a spike, but you can with cloud inference.

rprend 2 hours ago | parent | prev | next [-]

This is not true. API tokens are not sold at a loss, and hardware gets more efficient over time, so serving inference on the same model gets cheaper. Llama 3.1 405B was $6/$12 per million tokens in 2024, but in 2026 that same model is $3/$3.

The most intelligent model at a given time is much larger than the previous one, which is why token costs for GPT5.5 are higher than for 5.4. But you should expect that 2 years from now, serving a GPT5.5-sized model will be cheaper than serving GPT5.5 is today. You should expect it to be even cheaper to get an equally intelligent model 2 years from now, because distillation techniques are effective at reducing the necessary parameter count for the same benchmark scores.

Groxx an hour ago | parent | prev | next [-]

https://old.reddit.com/r/GithubCopilot/comments/1tbb5bj/gith...

Seems to be on its way! I know of at least one person whose company is looking at a 20x increase, and afaict (from related looking around, nothing concrete tho) business accounts are missing some costs in the calculator so it'll likely be higher.

raincole 37 minutes ago | parent | prev | next [-]

You should probably take some stay-on-topic pills, as this article is clearly and unambiguously talking about open-weight models (e.g. Gemma 4), not the ones allegedly being sold at a loss (Opus, ChatGPT, etc.)[0].

[0]: these APIs are not sold at a loss either, by the way. But it's a nice meme so let's just pretend they are.

OsrsNeedsf2P 4 hours ago | parent | prev | next [-]

The models have been dropping 10x in price for completing the same tasks, year over year. Even if you think Anthropic is losing money charging 10x more than everyone else for their 400B model, the prices will continue to go down based on model improvement alone.

ianberdin 4 hours ago | parent | prev | next [-]

Do you have proof? Anthropic’s CEO said they are profitable. Same with OpenAI.

dijit 4 hours ago | parent | next [-]

Profitable for inference, if you completely ignore training costs and the fact that you absolutely must continuously train new models.

vlovich123 4 hours ago | parent | next [-]

Which is where your analogy breaks down and why you think you’re taking crazy pills. Inference is growing and selling the oranges in your analogy. Model building is growing the farm to sell larger, juicier, more addictive oranges.

skippyboxedhero 4 hours ago | parent | next [-]

The same mistake was made with Amazon, and a million other tech companies in the early 2010s.

Amazon was losing money, and it was losing money because it was growing and spent all of its cash flow on growth. It wasn't merely regarded as a hopelessly unprofitable business; it was regarded as potentially fraudulent. The share price collapsed in 2014 because, some thought, the profit would never come, investing in growth was pointless, etc.

Last year Amazon made nearly $100bn in profit. The stock is up 20x from then... and this is after AWS was known (everyone also said that was a massive fraud that could never be profitable... we now know it was printing money from day one), after it became the world's biggest retailer, etc.

It is difficult to overstate how consistently people make this mistake, not just individually but in aggregate. You see the same thing with restaurants, consumer products, office leasing, so many businesses. This is not to say that the future will happen any particular way, but what Anthropic and co are doing is obviously rational and based upon very real cash flow. Anthropic's growth in revenue is, I believe, unparalleled in modern corporate history. A slight difference in this case is also that the economics of training these models are improving exponentially over time.

dijit 4 hours ago | parent | prev | next [-]

Are ya fuckin' serious mate?

The restaurant next to the mines was profitable up until the moment the mines themselves shut down: one doesn't exist without the other.

You can't ringfence inference as "the profitable bit" and then hand-wave away the training. Without continuous training there is no inference product.

Claude 3 Opus isn't sitting there making revenue in 2026 - the thing is just deprecated. The moment you stop spending billions on the next model, your "profitable" inference business is on borrowed time until someone else makes it obsolete.

Maybe I made a mistake in my analogy... They're not growing a farm and then selling oranges. They're on a treadmill where stopping is death, and the treadmill costs $10bn a year to keep running.

atq2119 2 hours ago | parent | next [-]

> Without continuous training there is no inference product.

This claim deserves teasing apart.

Clearly, training is a Red Queen's race today. If a model provider were to unilaterally decide to stop training, they would very quickly lose market share to competitors with better models.

On the other hand, what if market and investment conditions change such that everybody has to stop training?

In that case, the models are still there and still as useful as they were the day before. So why wouldn't there still be an inference product?

energy123 25 minutes ago | parent | prev | next [-]

What's the point of these words and analogies when the only thing that matters is the numbers? Gross margins of 20% versus 70% make a world of difference (literally the difference between a company that's about to collapse and a multi-trillion-dollar self-sustaining juggernaut), but in your world of words these two companies are the same thing.

vlovich123 4 hours ago | parent | prev | next [-]

> They're on a treadmill where stopping is death, and the treadmill costs $10bn a year to keep running.

You’re literally describing all companies. Google takes about $270bn/year to run. If they stopped spending that they’d die pretty darn quick. It’s also a description of working: unless you’ve built up significant savings, if you stop working you’re also going to die.

bjt 3 hours ago | parent | next [-]

> You’re literally describing all companies.

No, not quite. It really comes down to opex vs capex and the depreciation schedule for your investment.

Software development is typically categorized as capex, on a 3-5 year depreciation schedule. You assume the software you write today will be generating value for you that long.

If a big, expensive model training project only gives you value for a year or less, that is not like most companies.
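
To make that concrete, here is a toy straight-line depreciation calculation; the $10bn training-run figure and the schedules are made-up numbers for illustration, not anyone's real financials:

    # Toy illustration of why the useful life of a training run matters.
    # The cost figure and the schedules are made up, not real financials.

    TRAINING_RUN_COST = 10e9  # hypothetical cost of one frontier training run, $

    def annual_depreciation(cost: float, useful_life_years: float) -> float:
        """Straight-line depreciation: expense recognised per year of useful life."""
        return cost / useful_life_years

    for life_years in (5, 3, 1):
        expense = annual_depreciation(TRAINING_RUN_COST, life_years)
        print(f"{life_years}-year useful life: ${expense / 1e9:.1f}bn expensed per year")

If the asset really only lives a year, the full cost of a run hits the income statement every single year, which is exactly the difference from ordinary 3-5 year software capex.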

vlovich123 2 hours ago | parent | next [-]

No, the IRS made that change a while back as part of the TCJA but that’s been reverted in the OBBBA. If you build something and never touch it, sure that should probably be capex you have to depreciate. But if you’re investing continuously in it over time, I don’t see how it’s anything other than opex - there’s nothing being depreciated because you’re constantly improving it. Automobile manufacturers don’t have to count their labor force as capex. Indeed I can’t think of any other industry where labor is capex.

But believing that the financials of a project are governed solely by how IRS rules force you to account for headcount is kind of silly.

> If a big, expensive model training project only gives you value for a year or less, that is not like most companies.

The model itself that gets built? Sure (although clearly the timelines are getting longer). However, the important bit here is the research that got done along the way and the infrastructure built to make that model-building process cheaper, better, etc. All of that stuff sticks around, but because it's hard to appreciate externally you discount it to 0, when it's literally what they actually spent the money on.

But none of that even matters. Google had $270B in opex, and their capex has grown from $50B in 2024 to $90B in 2025 and is projected to reach ~$175B for 2026. But even if you discount the "AI" treadmill, you're still looking at many tens of billions in capex that they'd die without if they stopped spending.

Anon1096 2 hours ago | parent | prev [-]

Software that is sold as a service and requires ongoing maintenance like running in the cloud (and people to keep it running in the cloud) is opex not capex. Google Search is most definitely opex.

Danox 4 hours ago | parent | prev [-]

The problem is I don’t think computing is going back to the mainframe era, you know, where all the computing is done remotely and the only thing you have in front of you is a terminal. That is the AI slop maker’s dream. The computing power on the desktop/laptop/tablet/phone is getting better, and the models are getting smaller and quicker.

There is no moat. In the end, what we are calling AI today will just be something that is incorporated into existing programs that people will use to help them accomplish a task. The public will not be paying more for it. It will just be a commodity added to the existing ecosystems we have today.

genxy 2 hours ago | parent | prev [-]

> Claude 3 Opus

Unless they are changing the architecture in huge ways, the pre-training done for 3 goes into later models. I am sure the frontier labs are figuring out how to pretrain generic feedstocks that can be fed into downstream training pipelines. DeepSeek's incremental training run cost was what, $5M? Alibaba and DeepSeek have the most efficient training pipelines; look at the rate at which custom Qwen models are being pumped out.

no-name-here 3 hours ago | parent | prev | next [-]

> Inference is growing and selling the oranges in your analogy. Model building is growing the farm to sell larger, juicier more addicting oranges.

In this analogy, model training would be akin to developing better oranges, but your competitors are also developing better oranges so if you stop spending heavily to improve your oranges, consumers are going to buy ~zero oranges from you within a couple years. (Expanding the farm might be analogous to expanding data centers.)

lowbloodsugar 11 minutes ago | parent | prev | next [-]

Last month Anthropic tried to control the narrative by drumming up the “super scary AI” trope.

The news they successfully buried was that companies like Airbnb are now running Qwen and open source models. The free oranges are now good enough. There is no future unless the goal is to get to superintelligence and utterly take over the world before anyone else gets one. Anything else and the free models are only six months behind. The money now is the opposite of what everyone thought a year ago: datacenters. Everyone thought AWS was fucked. Turns out AWS is really good at running Qwen.

xienze 3 hours ago | parent | prev [-]

In this particular case, inference and training are intertwined. It might be one thing if Anthropic could get away with training a new model every five years and control costs that way. But they can't. Put another way, their inference has no value without continuous, very expensive training, because consumers aren't purchasing based on price but on capability; otherwise the Chinese models on OpenRouter would have buried OpenAI and Anthropic already.

spzb 4 hours ago | parent | prev [-]

And ignore capital costs, depreciation, user churn, etc.

tiffanyh 4 hours ago | parent | prev | next [-]

Do you mind sharing source links to that profitability claim?

I’m struggling to find the quotes.

no-name-here 3 hours ago | parent [-]

Open AI: https://simonwillison.net/2025/Aug/17/sam-altman/

Anthropic: https://x.com/jaminball/status/2052112309364162874

initatus an hour ago | parent [-]

My reading on Anthropic is that he strongly implies that they're profitable, realizes what he has said, and immediately walks it back as explicitly not the case today, reframing it as a guess about some indeterminate point in the future.

> Those are the economics of the industry today, or not today but where we're projecting forward in a year or two.

Danox 4 hours ago | parent | prev | next [-]

AI CEOs are known to say many things; telling the truth probably isn’t one of them.

miltonlost 4 hours ago | parent | prev [-]

If only they had their books open to do more than just "say"

tempest_ 4 hours ago | parent | prev | next [-]

It is the model training that is dragging them down.

If the arms race stopped tomorrow, the current prices would pay for the inference.

Danox 4 hours ago | parent [-]

But isn’t training models a forever task, like iterating in tech where you can never take a day off? Adding humans to the equation: don’t humans train/teach themselves new skills over a lifetime? And isn’t one of the future selling points of this AI slop that your AI never goes to sleep and can always be trained forever? The price of entry for AI will only increase as we go on into the future.

atq2119 2 hours ago | parent | next [-]

I agree that training is a forever task, and the current rate of training is probably not sustainable. But all that means is that once the current investment mania ends, the market will most likely find a new equilibrium where continuous training still happens, but at a slower rate that can be sustained by inference revenue.

tonfa 24 minutes ago | parent [-]

> but at a slower rate that can be sustained by inference revenue.

Also, it's possible that the scale of inference needed (e.g. Jevons paradox) keeps growing to the point that training costs can be fully absorbed (since training cost is one-off vs. inference, which can scale).

(I suspect that might be the thinking, I don't know if it will be true, it's also possible that no model will create a moat big enough to attract enough of the inference traffic to make it true).

Depending on the chips/architecture used, the off-peak traffic from inference can also subsidize the training costs.
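
The amortisation argument is easy to put numbers on. A toy sketch, with a made-up training cost and a made-up gross margin per million tokens (neither figure is from the thread or any real provider):

    # Toy amortisation of a one-off training cost over inference volume.
    # Both figures are illustrative assumptions, not real numbers.

    TRAINING_COST = 1e9    # hypothetical one-off training run, in dollars
    MARGIN_PER_MTOK = 1.0  # hypothetical gross margin per million tokens, $

    def training_cost_per_mtok(total_mtok_served: float) -> float:
        """Training overhead per million tokens, given total lifetime volume."""
        return TRAINING_COST / total_mtok_served

    for mtok in (1e6, 1e8, 1e9, 1e10):  # million-token volumes over the model's life
        overhead = training_cost_per_mtok(mtok)
        covered = "covered" if MARGIN_PER_MTOK >= overhead else "not covered"
        print(f"{mtok:>14,.0f}M tokens served: ${overhead:,.2f} overhead/Mtok ({covered})")

The point is just that the same one-off numerator gets divided by an inference volume that keeps growing, so the training overhead per token can shrink toward irrelevance if the traffic actually shows up.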

asjir 3 hours ago | parent | prev [-]

Just keeping it up to date with competitors is much cheaper, by copying better ones like Qwen did with Claude. Also, a bunch of research is trickling into open source / arXiv, so catching up should keep getting cheaper, at least as a fraction of the cost of training from scratch.

visarga 4 hours ago | parent | prev | next [-]

> Frontier AI companies are selling at a loss.

There are huge economies to be had by batching requests and using lots of RAM for MoE (sparse models). You can't achieve that efficiency at batch size 1 on a single node.
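
A rough sketch of why batching helps MoE models in particular. The expert count, top-k routing, and per-expert size below are made-up illustrative numbers, not any real model:

    # Toy model of expert-weight traffic for one Mixture-of-Experts layer.
    # Expert count, routing, and sizes are illustrative assumptions.

    NUM_EXPERTS = 64          # experts in the layer
    ACTIVE_PER_TOKEN = 4      # experts each token is routed to (top-k)
    BYTES_PER_EXPERT = 0.5e9  # weight bytes per expert (hypothetical)

    def expert_bytes_per_token(batch_size: int) -> float:
        """Expected expert-weight bytes read from DRAM per generated token.

        Assuming uniform random routing, the chance an expert is needed by at
        least one of the batch's tokens is 1 - (1 - k/E)^B. The whole batch
        shares a single read of each needed expert, so per-token cost falls.
        """
        p_needed = 1.0 - (1.0 - ACTIVE_PER_TOKEN / NUM_EXPERTS) ** batch_size
        experts_read = NUM_EXPERTS * p_needed
        return experts_read * BYTES_PER_EXPERT / batch_size

    for b in (1, 8, 32, 128):
        gb = expert_bytes_per_token(b) / 1e9
        print(f"batch {b:4d}: ~{gb:5.2f} GB of expert weights read per token")

Batching only pays off if every expert stays resident, hence the "lots of RAM"; a batch-size-1 setup on a single node streams the active experts for every token and shares nothing.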

asjir 3 hours ago | parent [-]

Exactly, they put a lot of money into engineering and it does give results.

vlovich123 4 hours ago | parent | prev | next [-]

Except that’s not what the analysis is. They’re spending < $1 to get $1 from you and the other $9 to figure out how to improve the model further and build up products on top of that to turn that $1 spend into $5 in the future.

In other words, inference is fairly profitable for them and the rest of the money is spent growing revenue as quickly as possible. Building models is still an expensive line item but the costs for that are going down with time.

There is also maybe a “capture the market” mentality but I don’t think that’s necessarily it - the tools and processes are largely fungible and that’s a huge problem. They need to figure out how to make it sticky for “capture the market”, but there’s also a very real “grow as big as possible as quickly as possible to take on Google”; Google has an existential threat here.

poly2it 4 hours ago | parent | prev | next [-]

Well, I'd be surprised if non-R&D inference providers were selling at a loss. There are plenty to choose from, and competition is quite healthy. Will they keep providing cheap tokens while the labs raise their prices? Probably, but then I don't see how prices could be raised in the first place. And what timescale are you talking about? A couple of years? It is appropriate to assume inference will become more efficient over time. If you raise your prices, you are going to be outcompeted before it becomes profitable (if you assume it is unprofitable now), which would be negligent. I don't see how this makes sense.

throwatdem12311 3 hours ago | parent | prev | next [-]

The Michael Scott AI Companies.

EGreg 4 hours ago | parent | prev | next [-]

> These models will become more expensive over time; it's functionally impossible for them not to. They just want to capture the market before they have to stop selling at a huge loss.

They could have said the same about transistors. People keep inventing new ways to keep the costs down. Just look at the latest Qwen, DeepSeek, BitNet. Interesting tidbit: they're all open, and as that leaked Google memo said in 2023: they have no moat.

MattRix 4 hours ago | parent | prev | next [-]

The inference is absolutely not sold at a loss, at least not when paying API prices (the subscriptions are less clear). The reason frontier model companies aren’t profitable is because training the models is so costly, not inference.

MuffinFlavored 4 hours ago | parent | prev | next [-]

> Frontier AI companies are selling at a loss.

How big/deep of a loss?

I feel like I read this every day for years: that Uber was running this same "idiotic, losing" strategy (as it was pitched/discussed), and then one day we woke up and... without much fuss, boom, they were profitable seemingly overnight.

Danox 3 hours ago | parent | next [-]

As long as you have slaves/sharecroppers driving the people at the top of the pyramid, Uber is profitable. Uber makes money as long as you don’t care about the workers, and as long as you can get around all of the regulations that are put on traditional cab companies, if there are any left on the road.

For me, nothing says low class like the Porsche dealer saying "we can call an Uber for you to take you home". Ridiculous… and it was a low-class experience: dirty car, small, never again ha ha ha…

brianwawok 4 hours ago | parent | prev | next [-]

Well, and Uber cut the driver pay in half and doubled the price. They didn’t really find any efficiencies; robo drivers don’t exist yet. That’s also why I hardly touch them anymore.

onesociety2022 3 hours ago | parent | next [-]

All that tells me is they did find an efficiency. If they didn’t, their driver supply would have dropped. Unlike the taxi business, Uber/Lyft can tap into an otherwise dormant supply of drivers who already own a car but aren’t willing to spend 40-60 hours a week driving a taxi. With Uber/Lyft, they can become part-time drivers (they have flexibility and they can use an asset they already own anyway). Is it worse for the full-time taxi drivers who used to have the supply artificially constrained in the old medallion system? Yes. But does it also benefit others who want to do this as a flexible job, with zero skills required other than driving, no boss to deal with, no job interviews, etc.? Yes!

MuffinFlavored 3 hours ago | parent | prev [-]

> Well and uber cut the driver pay in half and doubled the price

Devil's advocate:

* inflation caused everything to go up to some degree since then

* if it was "that bad" as you say, they wouldn't be extremely profitable and have so many users

Both things can be true? "They cut the driver pay in half and doubled the price" did not lead to the collapse of the business or cause people to stop using it.

spzb 4 hours ago | parent | prev [-]

Ed Zitron discusses this as part of his post on AI economics: https://www.wheresyoured.at/ais-economics-dont-make-sense/

ajross 4 hours ago | parent | prev [-]

> I feel like I'm taking crazy pills.

Why? It's no less crazy than when Uber and Lyft were doing the same thing. Or when the entire tech industry was doing it in the dot com boom.

Investment-driven market growth at a loss is like the least surprising thing in all of this. The tech is new and fascinating. The bubble is just another trip through the funhouse.