crazylogger 2 days ago

I haven't seen anyone claiming that API prices are subsidized.

At some point (from the very beginning until ~2025Q4), Claude Code's usage limit was so generous that you could get roughly $10–20 (API-price-equivalent) worth of usage out of a $20/mo Pro plan each day (2 × 5h windows). And for good reason: LLM agentic coding is extremely token-heavy, and people simply wouldn't come back to Claude Code if the included usage weren't generous, or if every prompt cost them $1. Then Codex started trying to poach Claude Code users by offering even greater limits and constantly resetting everyone's limits in recent months. The API price would have to be ~30x the operating cost for this not to be a subsidy, and that would be an extraordinary claim.
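A rough back-of-envelope for that claim; the only inputs are the $10–20/day figure from above and a 30-day month:

```python
# Back-of-envelope: implied subsidy ratio for a $20/mo plan, assuming
# $10-20/day of API-price-equivalent usage (figures from the comment above).
monthly_price = 20.0             # $/month for the Pro plan
daily_api_equiv = (10.0, 20.0)   # $/day of usage priced at API rates
days = 30

for d in daily_api_equiv:
    monthly_api_equiv = d * days                  # same usage billed via the API
    ratio = monthly_api_equiv / monthly_price
    print(f"${d:.0f}/day -> ${monthly_api_equiv:.0f}/mo API-equivalent, "
          f"{ratio:.0f}x the plan price")
# A heavy user gets 15-30x the plan price in API-equivalent tokens, which is
# why the API price would need a ~30x margin over cost for this not to be a subsidy.
```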

nl 2 days ago | parent | next [-]

The claim that APIs are subsidized is very common.

eg:

> Token prices are significantly subsidized and anyone that does any serious work with AI can tell you this.

https://news.ycombinator.com/item?id=47684887

(the claims don't make any sense, but they are widely held)

vessenes 2 days ago | parent [-]

I’ll note that it’s common and dangerous, in that there’s a generation of engineers at risk of leading each other astray about the economics, and therefore the probability distribution of outcomes, of firms that will massively impact their careers.

I think I understand the major reasons for this meme, but I find it really worrying; there were lots of incorrect ‘it’s a bubble’ conversations here in 2012–2015, but I don’t think they had the pervasive nature and the “obvious” conclusion that a whole generation of engineering talent should just, you know, leave.

Meanwhile, I am hearing rational economic modeling from the companies selling inference. Jensen (a polished promoter, I grant you) says it really well: token value is increasing radically, in that new models mean better quality, so revenues and utilization are increasing; and therefore, contrary to the popular financial and techbro modeling of 2023, things like A100s still cost quite a lot, whether rented hourly or purchased outright. (!) Basically, the economic value is so strong that it has radically extended the life of the hardware.

I just hate to imagine half of the world’s (or the US’s) engineering talent quitting, spending ten years afraid, or wrongly convinced of some ‘inevitable’ market outcome. That feels bad for people’s personal lives and bad for progress simultaneously.

mike_hearn 2 days ago | parent [-]

People shouldn't be quitting the industry, agreed. There's plenty of work to do even with AI assistance.

But how is that a counterpoint to tokens being subsidized? They obviously are subsidized, this just isn't arguable at all. The claims in the linked post make perfect sense. If they weren't subsidized the investors in AI labs would all be minting money instead of burning it.

It doesn't matter if token value is increasing. What matters is how fast it increases relative to price increases, debt repayments, and other things we can't really know here on this forum.

Every attempt I've seen to argue this fact away is merely playing with numbers, e.g. excluding every cost except inference hardware and energy, even though the labs are always training and have large costs outside of compute. That might or might not be a good way to predict the future of these orgs, but it doesn't help anyone argue that inference is profitable today (because inference is literally the only thing OpenAI/Anthropic sell, and they lose money).

The whole computing industry is in a super weird place right now that feels temporary, like Wile E. Coyote spinning his legs in midair. Until the economics of the AI industry stop being driven by FOMO and by weird, hard-to-interpret quasi-religious or geopolitical motivations, it's impossible to make accurate predictions about the impact on software jobs.

Historically, a tech like this would have started at super-high prices, and the token cost would have gradually fallen over a period of decades, giving people plenty of time to adapt. Look at the cost of flying, desktop computers, mobile phones, etc. AI is attempting to short-circuit that normal technological path and pack decades into years by convincing capital holders that they have no choice but to "invest", because it'll be a winner-takes-all repeat of web search and social media. Yet it's not shaping up that way.

nl 2 days ago | parent | next [-]

> But how is that a counterpoint to tokens being subsidized? They obviously are subsidized, this just isn't arguable at all.

Why would Microsoft subsidize Anthropic's models when they serve the Claude model on Azure? They charge the same price as Anthropic. They aren't an investor in Anthropic.

There are numerous independent model-serving companies that are clearly profitable serving non-frontier models (Kimi K2.5, etc.). It's easy to work out the raw cost of B200 GPUs, then see what you'd need to charge for an API, and see that they make money.
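A sketch of that calculation. Every number below (rental rate, node size, throughput, utilization) is an illustrative assumption, not a measured figure; the point is only the shape of the arithmetic:

```python
# Hypothetical unit economics for serving an open-weight model on rented GPUs.
# All inputs are assumed for illustration, not real quotes or benchmarks.
gpu_hour_cost = 5.00     # $/hr, assumed rental rate for one B200-class GPU
gpus_per_replica = 8     # assumed node size needed to host the model
tokens_per_sec = 4000    # assumed aggregate output throughput per replica
utilization = 0.5        # fraction of capacity actually sold to customers

cost_per_hr = gpu_hour_cost * gpus_per_replica          # $40/hr for the node
tokens_per_hr = tokens_per_sec * 3600 * utilization     # tokens sold per hour
breakeven_per_million = cost_per_hr / tokens_per_hr * 1e6
print(f"breakeven: ${breakeven_per_million:.2f} per million output tokens")
# Any API price above this breakeven covers the raw hardware cost; compare it
# to what the frontier labs charge to see the gap the comment is pointing at.
```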

The frontier labs charge a lot more than these companies.

The frontier labs have said they are profitable on inference.

Most people believe that training (and maybe subscriptions for some users) is where they lose money. Why do you think otherwise?

mike_hearn a day ago | parent [-]

Who says it's MS subsidizing those prices and not Anthropic themselves? Just because someone rehosts a model doesn't imply they get to set whatever price levels they want.

I don't think otherwise, I just think it's meaningless to differentiate between training and inference. What the frontier labs sell is inference. They can't just exclude costs required to engage in that business unless they plan a pivot to just serving Chinese models in a commodified market.

Yes, tokens from random no-name firms serving Kimi K2 probably do make money, although even there it's unclear, because so many datacenters and GPU purchases have been made on credit, etc. And if we assume that's sustainable forever, then you can also assume training/staffing costs are subsidized down to zero and say, sure, token serving is profitable in that situation. But we were discussing the top labs.

vessenes 9 hours ago | parent | prev [-]

Hi Mike! Long time - super nice to see your name in my HN feed.

I’ll fight you on profit. The major labs are super profitable. If you replace “profitable today” with “cashflow positive today” then I think you’re correct. They are clearly not cashflow positive today. However, they are absolutely profitable, and when people confuse those I think it can be dangerous.

Consider a series of companies, let’s call these companies “Claude 1, Inc”, “Claude 2, Inc”, “Claude 3, Inc”, “Claude 4, Inc”.

In each company let’s keep track of the following:

* The pro-rata hardware and energy costs the company used during training. So, for instance, if a cluster is going to “last” 5 years, and we used it for 2, and the cluster cost $1 billion to build, provision, and pay for 5 years of energy usage, we would charge $400mm ($1B × 2/5).

* The R&D expenses like salary and so on

* The inference costs of every use of that company’s model.

* The revenue acquired in exchange for use of that model.

I propose first that I haven’t hidden any costs or double-counted any revenue - this is a full, fair assessment of the costs and likewise the revenue earned. I propose second that if you go to the end of the company’s final period, then “profitability” in this case equals “cashflow”, so we can talk about either without talking past each other. Third, I propose that if you add up all the costs and expenses of Claude 1–4, Inc, you’d have the full P&L of Anthropic, up to any training done on Claude 5.

I will now repeat a statement made publicly and repeatedly by Dario (and Sam in a slightly more cagey way): every single one of those “companies” (fully loaded models) has turned a profit so far. Put another way, it has, repeatedly, been a very good financial decision to train a model, and then sell inference of that model.

Why are the frontier companies spending cash? Simple: as each new model comes out, it quickly becomes apparent that the new model will pay, and so increased training costs are incurred before that model has ended its useful life. Due to scaling, each new run costs some multiple of the prior run. Combining the overlap and the scale-up, these companies are cashflow negative. But they aren’t in some weird race to spend a dollar to make $0.50. They’re spending a dollar to make something like $6 a year, for a year or two.
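The per-model framing above can be sketched with toy numbers (all invented): each “Claude N, Inc” turns a fully loaded profit, yet the combined company is cashflow negative, because each training run costs a multiple of the last and starts before the prior model’s revenue has fully arrived:

```python
# Illustrative only: every model is profitable on a fully loaded basis, but the
# parent company is cashflow negative because it is always paying for the NEXT,
# several-times-larger training run out of partially collected revenue.
models = [
    # (name, training_cost, inference_cost, lifetime_revenue) in $mm - invented
    ("Claude 1", 100, 50, 300),
    ("Claude 2", 300, 150, 900),
    ("Claude 3", 900, 450, 2700),
]
for name, train, infer, rev in models:
    profit = rev - train - infer
    print(f"{name}: fully loaded profit ${profit}mm")   # positive for each model

# Current-year view: training "Claude 4" (3x the last run) has already started,
# but only part of Claude 3's lifetime revenue has been collected so far.
next_run = 2700          # assumed cost of the Claude 4 training run
rev_so_far = 1000        # assumed Claude 3 revenue booked to date
cashflow = rev_so_far - next_run - 450   # 450 = Claude 3 inference cost
print(f"current-year cashflow: ${cashflow}mm")  # negative despite per-model profits
```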

If you see this, most of the ‘bubble’ (and implied massive crash) forecasts don’t seem to have any basis in reality from my perspective.

Frontier lab models are fucking great earners: 60%+ inference margins. (Public statements by said CEOs. Lateral proof: similar-sized open models are available for inference at 1/8 to 1/10 the price on OpenRouter. Ergo: closed-model margins are high.) These earnings are real dollars, hard cash. Maybe the datacenters are in a bubble? After all, there’s a lot of debt getting laid on to do datacenter buildouts.
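A sketch of that “lateral proof” with assumed numbers: if a similar-sized open model sells for 1/8 the frontier price and commodity hosts keep only a thin margin, the open-model price roughly bounds the cost of serving, which implies a high frontier gross margin on inference:

```python
# All inputs are assumptions chosen to illustrate the inference, not real prices.
frontier_price = 8.0   # $ per million tokens at a frontier lab (assumed)
open_price = 1.0       # $ per million tokens for a similar-sized open model (assumed)
host_margin = 0.2      # assume commodity hosts keep a thin 20% margin

serving_cost = open_price * (1 - host_margin)   # implied cost to serve a model this size
frontier_margin = 1 - serving_cost / frontier_price
print(f"implied frontier inference margin: {frontier_margin:.0%}")
# With these assumptions the implied margin lands comfortably above the
# 60%+ figure the labs have stated publicly.
```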

Datacenter companies and hyperscalers are making money providing hosting to these frontier labs. CoreWeave (former ETH miner!) and others are posting 70% profit margins against debt costs under 8%. These profits are, again, hard dollars from the labs. So, maybe the hardware providers are in a bubble?

Nvidia is making 70%+ margins, consistently beating every earnings call, is spending like $6bn a quarter on R&D against $40+bn in share buybacks (made in cash). They are moving super fast, and they could still literally be spending another 7x their current R&D spend before going cashflow negative. So, maybe the foundries are in a bubble?

TSMC is showing 66% margins (record high), and cutting Apple’s allocation to a point where there are research warnings about it. Maybe the EUV lithography companies are in a bubble?

ASML is the most generous company in the world, and is showing 34% operating margin this year while providing the only machines that can make the chips that TSMC and others are selling.

This is all very real. To my eyes the possible negative financial outcomes that seem plausible are:

1 - Scaling laws stop working (and/or models get ‘good enough’), and all of a sudden the new hotness we just spent our entire last 5 years’ revenue on isn’t any better.

2 - There’s some major exogenous shift in demand for tokens and datacenter utilization drops radically, leading to credit defaults.

The main caveat: either of these would have to be industry-wide before it became a problem, and demand would have to fall to less than ~1/6 of current forecasts before it caused some kind of cascading financial problem. Until then we’d see CoreWeave breaking even, reworking its debt covenants, spending less on unused power, and paying lower prices for the power it does use (overcapacity = cheaper power), etc.

This is SUPER long already, but to close: I think it’s reasonable and interesting to talk about those scenarios. How likely is it that scaling stops working, or that people are okay with what we’ve got (that is, that token value stops increasing in a compute-unitized environment)? How likely is it that people stop buying tokens even if their utility is stable or growing?

Agreed we’re in a temporary transitional phase right now, but I think it’s to a radically new business model and economic order more than it is a prelude to a giant debt leveraged crash, Wile E. Coyote style.

dannyw 2 days ago | parent | prev [-]

Yeah, subscriptions used to be extraordinarily generous. I miss those days, but the reinvigoration of open weight models is super exciting.

I'm still playing with the new Qwen3.6 35B and impressed, and now DeepSeek v4 drops, with both base and instruction-tuned weights? There goes my weekend :P