Aurornis 2 days ago

> We are slowly inching closer to the point where AI and AI products will be billed for what they cost.

I suspect the APIs are already being served at prices with profitable unit economics. The SOTA API prices are much higher than what it costs other providers to run very large open-weight models.

The monthly subscription plans were being offered at a discount to generate interest in these models.

We're not entering a period of billing AI at cost. We're entering a period of exploring how high prices can go before losing too many customers.

Products and services aren't sold at cost. They're sold at the price the market will bear. It takes some experimentation to find that equilibrium point where you make more profit per customer but don't lose too many customers.

malfist 2 days ago | parent [-]

> I suspect the APIs are already being served at prices with profitable unit economics.

There is absolutely no evidence to support this.

selectodude 2 days ago | parent | next [-]

Some basic math supports it. A GB300 NVL72 is about $6.5 million. Let's say you need $6 million worth of cooling and another $6 million worth of electricity. At current rates, that's 720 billion tokens worth of Claude Opus 4.7. At 100,000 tokens per second, it pays for itself in about 3 months.

Obviously this is an extremely rough calculation. I could be off by a factor of 10 and it would still be a pretty good return.
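
A minimal sketch of that arithmetic, using only the rough numbers above (all figures are assumptions, not published pricing):

    # Back-of-envelope payback estimate; every number here is an assumption.
    hardware_cost = 6.5e6      # GB300 NVL72 rack (assumed)
    cooling_cost = 6.0e6       # assumed cooling spend
    electricity_cost = 6.0e6   # assumed electricity spend
    total_cost = hardware_cost + cooling_cost + electricity_cost  # $18.5M

    tokens_to_break_even = 720e9   # assumed token volume at current rates
    tokens_per_second = 100_000    # assumed sustained throughput

    implied_price = total_cost / tokens_to_break_even * 1e6  # ~$25.7 per million tokens
    payback_months = tokens_to_break_even / tokens_per_second / (60 * 60 * 24 * 30)
    print(f"implied price: ${implied_price:.2f}/M tokens, payback: {payback_months:.1f} months")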

overfeed 2 days ago | parent [-]

Unless you're serving Chinese open-weight models, you have to consider training costs. If you're off by 10x, then the amortization period is 30 months, far longer than the useful lifetime of SoTA models. Frontier model development is a Red Queen's race: you have to run as fast as you can just to maintain your position.

selectodude a day ago | parent [-]

The discussion was whether Anthropic makes money on inference. They do. They lose billions on training.

joshuastuden a day ago | parent [-]

No, because Anthropic can't serve their models unless they train them.

Training is akin to the cost of building the software/product. Inference is selling the product.

BadBadJellyBean a day ago | parent [-]

It's quite easy to sell something for a profit if you ignore the costs. Ultimate free money hack. I will start selling canned beans for the price of the beans plus a few cents. I will just ignore the cost of the cans, labor, power, machines, maintenance, distribution, storage, and facility space. If I do that, the few cents extra are pure profit.

speedgoose 2 days ago | parent | prev | next [-]

We don’t know the model sizes, requirements, and optimisations, but we could take a guess using the infrastructure costs of the largest open-weight alternatives that perform slightly worse.

In my opinion, it’s a profitable kind of service. They probably don’t pay the public prices for the cloud GPUs though.

BadBadJellyBean a day ago | parent | next [-]

Just looking at infra cost is not enough. If the token price doesn't cover all the costs, they are losing money and will eventually have to raise prices further.

hyperadvanced 2 days ago | parent | prev [-]

In my opinion it seems like a very unprofitable service propped up by investor money trying to capture market share.

Or, as I would say if I were Bugs Bunny, “Duck Season”

ofjcihen 2 days ago | parent [-]

Rabbit season!

Aurornis 2 days ago | parent | prev | next [-]

> There is absolutely no evidence to support this.

Analysts like Semi-Analysis have done a lot of modeling and estimates on the topic.

But two can play this game: There is absolutely no evidence to support that API prices do not have profitable unit economics.

Khalos a day ago | parent [-]

I'm not familiar with that analysis, its accuracy, or its evidence. I would be surprised by this, given that providers still seem to be in the growth phase.

Typically the burden of proof is on the one making the claim.

Aurornis a day ago | parent | next [-]

https://semianalysis.com/

They have some of the best publicly available analysis on these topics. The full details and numbers are hidden behind the institutional accounts which are priced for investors (not something you sign up for personally) but they're generous with what they send out in their newsletter.

If you're not familiar with resources like this, I can understand how you'd assume that the providers are hemorrhaging money on inference costs, because that is the story that gets parroted around spaces like Hacker News.

You could ignore all of that, though, and go check OpenRouter to see what providers are charging for high-parameter-count models. They're not entirely at the level of the SOTA models, but the biggest open-weight models are not that far behind in complexity either. They're being sold an order of magnitude cheaper than what you pay for the APIs from the major players. We don't know exactly how big the major models are, but from the leaks we do have, it's unlikely that they're more than 10X more compute intensive.
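
A toy version of that comparison (the prices and compute ratios below are placeholder assumptions, not actual OpenRouter or Anthropic listings):

    # Hypothetical numbers only; not real OpenRouter or Anthropic prices.
    open_weight_price = 3.0   # $/M tokens, assumed price for a large open-weight model
    sota_api_price = 30.0     # $/M tokens, assumed SOTA API price (~10x higher)

    # If the open-weight provider at least breaks even, its price bounds its serving
    # cost, so a SOTA model that is k times more compute intensive costs at most
    # k * open_weight_price per million tokens to serve.
    for k in (2, 5, 10):
        margin = sota_api_price - k * open_weight_price
        print(f"{k}x compute -> inference margin of at least ${margin:.2f}/M tokens")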

joshuastuden a day ago | parent [-]

[flagged]

Aurornis a day ago | parent | next [-]

If you’re demanding rigorous proof for only one side of an argument while assuming the other side must be true, you’re not interested in honest debate.

The cost of AI inference has been a heavily analyzed topic. I trust the professional analysts much more than the casual Hacker News commenter claiming they’re losing money per token because they’re repeating what they heard some other Hacker News commenter say.

JohnHaugeland a day ago | parent | prev [-]

the claim that they suspect something is adequately backed up by saying “i suspect this”

nobody needs to prove their suspicions

JohnHaugeland a day ago | parent | prev [-]

there is no burden of proof on someone who says “i suspect”

winfredJa 2 days ago | parent | prev [-]

Mr. Truth-teller Amodei confirmed that APIs are profitable at Anthropic.

Eufrat 2 days ago | parent | next [-]

I don’t think any of the AI model providers have produced any evidence to back their claims of profitability.

I want to see their S-1s, then we can fight.

disgruntledphd2 2 days ago | parent | prev [-]

He didn't, he talked very carefully in hypotheticals.