| ▲ | infecto a day ago |
| It’s incredible how polarizing the AI rush is. I keep the perspective that the technology is an absolute step change, but I have no idea where the cards will fall. I take a lot of issue with this style of article; I get the sense the authors are being overly defensive. The cost to serve tokens is absolutely profitable today, and that’s been true for at least a year. What’s unclear is how R&D and capex fit into the picture. I am not that pessimistic on this front either, though. For the data center build-outs, demand for tokens still exceeds supply. On the R&D front, most of us here on HN have benefited from decades of inflated engineering salaries, often paid by companies that were not just unprofitable but had no real plan for success. In the current rush, supply cannot keep up with demand; it’s a much easier math problem when you have something people want (tokens) and just need to figure out how profitability works once R&D is included. |
|
| ▲ | Aperocky a day ago | parent | next [-] |
| Demand for tokens is absolutely skyrocketing. And unlike the traditional "this will replace humans right away" framing, I think what this introduces is a lot of incentive to spread those tokens into places where there was never any incentive to hire a software engineer before. In turn, that will drive a lot of business activity in those areas, activity that will potentially fail given the current quality of the output. This feels like a boom-before-bust scenario, and I'm not even sure it will bust. |
| |
| ▲ | gdilla 3 hours ago | parent | next [-] | | The busting will come from the token consumers. So many disasters waiting to happen. | |
| ▲ | skeeter2020 a day ago | parent | prev | next [-] | | Maybe we need a better definition of "bust," but we will surely see something along the lines of the hype-cycle graph with AI; what technology has not fallen into the trough before (in the best case) reaching a steadier state of use and growth? | | |
| ▲ | Aperocky a day ago | parent | next [-] | | It's also funny because a bust, by definition, takes two quarters. Can the AI cycle move faster than that? My workflow has been unrecognizable every six months for the past 24 months. Maybe we are close to the singularity, or maybe we'll just plateau somehow. Either way, there is so much work to support this breakneck change that isn't getting done, because the change itself takes priority every single time; there should be plenty of things to work on. | |
| ▲ | ido a day ago | parent | prev [-] | | cars? airplanes? |
| |
| ▲ | WarmWash a day ago | parent | prev | next [-] | | >potentially fail given the current quality of the output. The question is how big the fail is if you measure it in 3 month increments going back to late 2022. | | |
| ▲ | Aperocky a day ago | parent [-] | | Failures are beneficial to an economy. If there are no failures, you end up with the Soviet Union. As long as there are more successes than failures, it should be net positive. |
| |
| ▲ | hirako2000 a day ago | parent | prev [-] | | Tulip sales also skyrocketed. Seriously, what value are tokens providing other than justifying layoffs? Concretely. Today. Not in the speculative scenario where cardiologists are replaced with models. We see this new trend of agentic coding, again with a promise that software will be written that way going forward, despite the number of fiascos already experienced when trusting a model went bad. The use case may provide value, but right now all it does is fulfill the push for token consumption that all these AI leaders are advocating for. | | |
| ▲ | sempron64 a day ago | parent | next [-] | | It's ridiculous to call this tulips, in the sense of a speculative asset whose price depends on resale. A more similar recent example is the dotcom boom and bust, built on internet infrastructure, or the 2008 crash, which was based on cyclical infrastructure over-investment. Those crashes were characterized by demand growth not keeping up with investment because the target markets were tapped out. It's not clear when we'll get there with AI. The consumer market seems saturated on chatbots, but we're not even close to saturation for B2B or self-driving, for example. And this discounts other new technological offerings which may unlock larger consumer markets (products where people are willing to pay $100 a month instead of $10 or $20). All that said, the dotcom boom is extremely analogous, and that crash was quite bad. | | |
| ▲ | skeeter2020 a day ago | parent [-] | | Dotcom was maybe $100B a year, focused on the US and mostly VCs. AI is perhaps $250B in global VC (with more than half of ALL VC funding concentrated in one sector) and another $800B+ from non-VC sources. These numbers are basically a guess, but structurally we are set up for something much, much worse. | | |
| ▲ | infecto a day ago | parent [-] | | But unlike the dotcom boom, demand for tokens has not let up; it keeps increasing. I don't know where it falls; certainly companies won't get it exactly right, and they'll either over- or under-build. With the current rate of demand growth, it's hard to understand why you would stop building today. | | |
| ▲ | hirako2000 a day ago | parent | next [-] | | Demand for tokens exists, yes. On one side you have huge demand for infinitely subsidized tokens so people can post a "unique" illustration on social media, along with the text itself. On the other end you have professionals happy to pay a subscription for heavier use, building something in the hope of selling it. I figured out that I don't believe in the value when my dad explained that his mate fired his team once he realized he could just pay 20 bucks for a Gemini account and run his business. I asked, do you call this value added? He said it must be, since he can produce the same output with no staff. There is a confusion between profiting from a circumstance and creating value. You create value if, say, you cure a disease. Whether it takes you an army of staff, or how much profit you extract from it, is just a feasibility formula. Making the cure more affordable is value creation. Curing the same disease while increasing your profit doesn't create any value, except for yourself, for a while. | | |
| ▲ | Aperocky a day ago | parent | next [-] | | > I figured I don't believe in value Maybe you don't, but it's fairly obvious that a lot of things are changing and moving. Maybe your dad's mate didn't have to expand his business; good for him. Other businesses are expanding because they now can. Will the positives outweigh the negatives? Not necessarily, but to go "it's tulips" is the kind of argument so devoid of nuance that we shouldn't be entertaining it on HN. The overwhelming demand for tokens is not coming from people wanting a unique illustration; it's coming from professional usage. In fact, I'm not even sure who is being subsidized. The $20 subscription surely isn't being used to its full limit by every subscriber. | | | |
| ▲ | arbitrary_name a day ago | parent | prev [-] | | [dead] |
| |
| ▲ | fsloth a day ago | parent | prev [-] | | Yeah, this is the difference. The 2000s tech bubble was caused, among other things, by over-investment in infrastructure and technology that had no users yet. Totally different setup. That doesn't mean the AI boom won't turn to bust, but weak analogies generally don't help with understanding complex systems. |
|
|
| |
| ▲ | matheusmoreira a day ago | parent | prev | next [-] | | > Seriously, what value are tokens providing other than justifying layoffs. Concretely. Today. Claude helped me implement a ridiculous number of features in my programming language. It's helped me migrate the heap to an easily moveable index-based object space. It's helped me implement generators. It's helped me implement a new memory allocator. It's helped me fix a ridiculous number of bugs and make a huge number of small improvements everywhere. Its ability to provide repository-wide code review was a game changer for a solo developer like me. And it's doing so much more than that. I got more done in the past few weeks than in the previous months, even though I'm evaluating, learning, understanding and rewriting the AI output. It's actually addictive to build things with Claude. The usage limits are starting to make me anxious, just like withdrawal syndrome. I applied for their open source Max subscription program even though I'm too small for it, because who knows, I might get in anyway and it costs nothing. AI is quite literally a world-changing technology. I hope the open models keep steadily progressing and that hardware remains available to all, so we can run our own models on our own computers one day. | | |
| ▲ | Aperocky a day ago | parent [-] | | Just pay for it; think of it like college tuition, just far cheaper (if you are in the USA) and probably more useful in terms of job prospects. | | |
| ▲ | matheusmoreira a day ago | parent [-] | | I am paying for it. I subscribed to the Pro plan about two weeks ago. The Max plan is a significant expense I can't justify. I straight up cannot afford the API prices as an individual. |
|
| |
| ▲ | rhaen a day ago | parent | prev | next [-] | | Tulip futures skyrocketed because of economic speculation on a useless asset, not supply and demand. Crypto is the analogy, not AI. Given that the major AI labs other than GDM are private, this is even more true. Agentic coding absolutely blew up from demand; users are not being tricked into paying $200 a month, and they're not complaining about hitting rate limits because it's useless. | |
| ▲ | aurareturn a day ago | parent [-] | | > users are not being tricked into paying $200 a month
I can't believe people actually think users and companies are being tricked into paying for tokens. My $20 Codex subscription is so useful I can easily see myself paying $200 for it. This belief is so common among the AI-collapse crowd online. I'm guessing these people have only used free ChatGPT, or worse, they use Windows and get Copilot shoved down their throats. Meanwhile, I'm flying around with a $20 Codex subscription doing everything from writing code to analyzing stocks to coming up with ideas. | | |
| ▲ | fsloth a day ago | parent | next [-] | | I'm paying $20 for Codex and $90 for the Claude Max plan. They are a "pry from my cold dead fingers" product for me. IMO if someone last tried this tech 6 months ago, or their only exposure is via e.g. MS Copilot, they have a rational reason for skepticism. No technology of this complexity has improved this rapidly in my memory (well, OK, we had the CPU speed races from the '90s to the early 2000s). | | |
| ▲ | BirAdam a day ago | parent | next [-] | | The CPU speed race might be the most apt comparison I've yet heard. From the 80486 to the AMD Athlon 64 X2, much of that progress was enabled by better EDA tools running on the more powerful CPUs produced with each improvement. Now, we have better models helping to create even better models. | |
| ▲ | tartoran a day ago | parent | prev [-] | | Would you still pay if prices were to increase, say to $1500-2000 monthly? | | |
| ▲ | aurareturn a day ago | parent | next [-] | | Probably. I assume the value would drastically increase. Companies will definitely continue to pay for it. It's irreplaceable now. | | |
| ▲ | tartoran a day ago | parent [-] | | How about if the models plateau but prices skyrocket? Most companies would pay, but if you're not working for a company that does, what's the line beyond which you'd think twice about paying for it yourself? $500? $1000? $1500? | | |
| ▲ | aurareturn a day ago | parent [-] | | Why would prices skyrocket? Let's say models have already plateaued. Hardware continues to get better, right? So tokens should go down in price, not up. Since margins are already 50%+ on inference today, better hardware would let providers generate more tokens for less money. I would pay $500 to start, build stuff with it, then keep going up the tiers as the stuff I'm building makes money. |
|
| |
| ▲ | fsloth a day ago | parent | prev [-] | | Privately no. Professionally yes. |
|
| |
| ▲ | aiedwardyi a day ago | parent | prev [-] | | [flagged] |
|
| |
| ▲ | gruez a day ago | parent | prev | next [-] | | >Seriously, what value are tokens providing other than justifying layoffs. Concretely. Today. It's adding tests for me and doing medium complexity refactors that I'd otherwise have to spend hours on | | |
| ▲ | michaelcampbell a day ago | parent | next [-] | | Same, and constructing at least drafts of huge documents that I can iteratively fine-tune, which (at least last week) saved me tens of hours. And they're based on reality (the code) rather than my feelz about what I vaguely remember the code doing in some distant past. | |
| ▲ | datsci_est_2015 a day ago | parent | prev | next [-] | | These examples put it in the category of “best IDE ever created, by a wide margin”, but not “replacement for the programming workforce”. | |
| ▲ | aiedwardyi a day ago | parent | prev [-] | | [flagged] |
| |
| ▲ | infecto a day ago | parent | prev | next [-] | | I know there is a large contingent on HN that wants to deny the value of tokens, and I know it's anecdotal, but the writing is on the wall. If it's not valuable to your workflow today, it will be soon. I already have tests being written, and automated hooks into bug reports where an initial PR gets generated with a potential fix. It's far from perfect, but junior engineers are far less productive. | | |
| ▲ | rileymichael a day ago | parent [-] | | > there is a large force on HN that want to deny the value of tokens there is an even larger force on HN that financially _needs_ the value of tokens to be inflated (so much so that bots have overwhelmed the site) | | |
| ▲ | infecto a day ago | parent | next [-] | | That’s not me. I am simply an engineer who gets a ton of value out of these tools. | |
| ▲ | therealdrag0 a day ago | parent | prev [-] | | Really? Do you think there are more bots and employees of AI stakeholder companies than there are vanilla engineers on the site? | | |
| ▲ | rileymichael a day ago | parent [-] | | by far. at this point there are very few tech companies without exposure to AI |
|
|
| |
| ▲ | jdmoreira a day ago | parent | prev | next [-] | | Are you seriously comparing AI to tulips? I don't even know what to say. Even if you are very bearish on the technology, surely you can't be this detached from base reality. Yet here we are. | |
| ▲ | somewhatjustin a day ago | parent | prev | next [-] | | > Seriously, what value are tokens providing other than justifying layoffs. Coding, writing, summarizing, translating, data analysis, customer support, test generation. | |
| ▲ | senordevnyc 17 hours ago | parent | prev | next [-] | | In the last eight months, my solo SaaS has gone from $0 to $325k ARR, and growth is accelerating. I run tens of billions of tokens per month through automated pipelines for the product itself (which replaces an ultra niche human-driven process in a very non-technical industry), plus probably low billions more per month for coding, systems operation and management, data analysis, etc. And I feel like I'm just barely scratching the surface of what today's models are capable of. | |
| ▲ | vekker a day ago | parent | prev | next [-] | | > Seriously, what value are tokens providing other than justifying layoffs Like the OP said, it's incredible how polarizing this debate is. When I read comments like yours, I feel like a significant part of the global workforce in IT must be living on another planet? Or they never really used Claude Code, Codex, OpenCode, ... intensively before because of company policies? I legitimately am at least 10x more productive than a year ago, and I can prove it in number of commits and finished monetizable features developed per day. Obviously my workflows still very much require an active, constantly context-switching human-in-the-loop, but to me there's absolutely no question both output volume & quality have skyrocketed. | | |
| ▲ | classified a day ago | parent | next [-] | | > 10x more productive That claim is totally worthless without you providing concrete information how you measured that. | | |
| ▲ | hirako2000 a day ago | parent | next [-] | | And that's my point about value. That engineers can spit out far more code, or don't have to think as much, is surely a precious convenience. Evidence of value added is still lacking. Layoffs: it justifies them to the public. I'm not certain it warrants them, as it contradicts a principle of enterprise: scale as much as you possibly can. If tokens provided value today, we would be hiring more engineers to review their output and put things together. | |
| ▲ | vekker a day ago | parent | prev [-] | | I literally wrote how I measure this in the post you're replying to: number of commits, which is admittedly a worthless proxy for productivity, so, more importantly, number of finished production-ready features delivered. That number is at least tenfold what it was before, simply because I can now run a lot of gruntwork in parallel without wasting brainpower and focus on it. |
| |
| ▲ | Throaway1975123 a day ago | parent | prev [-] | | I created 5 websites this year and am working on 3 prototype games. For free. Without any knowledge of coding beforehand. | | |
| ▲ | hirako2000 a day ago | parent [-] | | Value? There are millions of other wannabe engineers doing exactly the same, assuming demand will scale as much as the supply. What returns are you getting on those? Let me create 500 websites, deployed for free, and hand them over to you by end of day. Will you give me a cent per piece? If so, happy to do business with you. | | |
| ▲ | Throaway1975123 a day ago | parent [-] | | The value is obviously to the people who will use this to replace engineers. I would happily pay $200 a month for this. Luckily I don't need to; it's free. Literally every game and website that I would have had to pay someone else to make, I can now make myself. There's no value in that? A year ago the best free LLM couldn't even give me a basic gridmap and collision. Now it can give me a full RCT-style prototype & editor in 20 iterations. I can only imagine what improvements we will have NEXT year! | | |
| ▲ | hirako2000 a day ago | parent [-] | | > luckily I don't have to. It's free. Ponder that for a minute. There are over 2 million games for Android alone. That you weren't making games before the advent of LLMs makes it cool for you to build, and at no cost. But people have been able to make games without them, and they already grew the market to saturation. If the outcome of LLMs is that we get more games, it doesn't follow that people will consume more games. Most games never get played anyway. | | |
| ▲ | Throaway199999 a day ago | parent [-] | | There's nothing to "ponder" as you so patronizingly put it, and your stats on gaming are self-evident. Op never said they're selling games. They said they're making their own games and websites for a fraction of the cost (even $0). That's amazing value. And it's just getting better. | | |
| ▲ | hirako2000 a day ago | parent [-] | | And that $0 is meant to count toward the value add that justifies the sort of funding we are seeing? I didn't mean to patronize; sometimes self-evidence isn't trivial to notice. | | |
| ▲ | Throaway199999 19 hours ago | parent [-] | | The funding is in anticipation of AI becoming so good that mistakes are only seen in the most complex output. In consumer applications, it's hard not to see that happening, given the exponential improvements of the past year. Whoever gets there first can capture the market. |
|
|
|
|
|
|
| |
| ▲ | 3yr-i-frew-up a day ago | parent | prev | next [-] | | [dead] | |
| ▲ | Throaway1975123 a day ago | parent | prev [-] | | Tulips had literally no economic value. LLMs do. | |
| ▲ | drakythe a day ago | parent [-] | | I say this as someone who has used them to boilerplate/scaffold a bit of code by this point: Economic Value of LLMs is debatable, if only because they're being too broadly applied. | | |
| ▲ | Throaway1975123 a day ago | parent [-] | | Debatable, sure, but not zero. Tulips are zero; they add nothing to anyone's output. LLMs are not zero. LLMs are not tulips. | |
| ▲ | skeeter2020 a day ago | parent [-] | | This is changing the narrative. Nobody really cares about tulips and some dumb throwaway comparison. Unless LLMs are worth an awful lot, the math here does not make sense. That is both debatable and important. | | |
| ▲ | hirako2000 a day ago | parent | next [-] | | Since I brought up the tulips: people do care about tulips. They do have value. So do LLMs. How many people will remain willing to pay for them, and how much, is what we call speculation. | |
| ▲ | Throaway1975123 a day ago | parent | prev [-] | | No, it isn't changing the narrative. Tulip bulbs were a huge bubble based completely on speculation. No one ever used a tulip to create a piece of software, or anything else. Their economic value was precisely zero. The whole thing was a bubble. LLMs may be IN a bubble, but they aren't tulips. |
|
|
|
|
|
|
|
| ▲ | boriskourt a day ago | parent | prev | next [-] |
| > The cost to serve tokens is absolutely profitable today and that’s been true for at least a year. > For the data center build outs, demand for tokens is still exceeding supply. Can you provide any numbers for this? |
| |
| ▲ | wongarsu a day ago | parent | next [-] | | I can get Kimi K2.5 inference on openrouter for about $0.5/MTok input + $2.5/MTok output, from six providers that have no moat besides efficiently selling GPU time. We can assume they are doing so at a profit (they have no incentive to do this at a loss), giving us those numbers as the cost to serve a 1T-a32b model at scale. Now we don't know the true size of any of the proprietary models, but my educated guess is that Sonnet is in about the same parameter range, just with better training and much better fine tuning and RLHF. Yet API pricing for Sonnet is $3/MTok input + $15/MTok output, exactly six times as expensive. Even Haiku is twice as expensive as Kimi K2.5. I find it difficult to believe in a world where those API prices aren't profitable. For subscription pricing it's harder to tell. We hear about those that get insane value out of their subscription, but there has to be a large mass who never reaches their limits. With company-wide rollouts there might even be a lot of subscription users who consume virtually no tokens at all. | | |
| ▲ | yobbo a day ago | parent | next [-] | | > We can assume they are doing so at a profit This is false. We may assume it's the most efficient way of generating revenue given their GPUs, but their overall profitability will just be a guess. They would still have incentives to run hardware at maximum, even when it's uncertain to eventually recoup costs. > a world where those API prices aren't profitable A lab with employees and models in training has other costs than the operating expenses of a GPU farm. | | |
| ▲ | aurareturn a day ago | parent | next [-] | | Why would a company sell inference on OpenRouter if they're not profitable? Except for Groq/Cerebras and a few other hardware companies looking to showcase their new chips. If they're losing money and have no VC backing, they'd just turn off the lights. |
| ▲ | financltravsty a day ago | parent | prev [-] | | The actual inference is operated at a 95%+ margin. |
| |
| ▲ | FiberBundle a day ago | parent | prev | next [-] | | This is like saying that innovative medical drugs could be sold at a profit if only there was no patent protection and the innovative companies would still invest in R&D. Yes, on a token level pure inference costs might be profitable, but the frontier Ai labs will surely have to recoup their R&D investments at some point. | |
| ▲ | jerojero a day ago | parent | prev | next [-] | | Companies doing foundational models need to cover the cost of training, which is much more expensive than training something like Kimi. | |
| ▲ | wongarsu a day ago | parent | next [-] | | Yes. I would not consider Kimi a particularly good model relative to its size, and making a SotA model is a lot more expensive. But training costs are explicitly excluded when talking about the cost to serve tokens | |
| ▲ | gruez a day ago | parent | prev [-] | | >Companies doing foundational models need to cover the cost of training [...] But that's moving the goalposts? The original claim was on inference itself, not the whole company. > The cost to serve tokens is absolutely profitable today and that’s been true for at least a year. | | |
| ▲ | lbreakjai a day ago | parent [-] | | But that's the same as thinking "This bar is selling a cocktail for $15. I could make it at home for 30 cents. They're making $14.70 of profit per cocktail; the owner must be a millionaire by now!" Everything is profitable if you ignore the costs. |
|
| |
| ▲ | ZitchDog a day ago | parent | prev | next [-] | | > they have no incentive to do this at a loss Are you sure? Surely there is a lot of interesting data in those LLM interactions. | | |
| ▲ | wongarsu a day ago | parent [-] | | Many of them are promising not to store any of this. Of course we have to trust them, for all we know they are all funded by various spy agencies |
| |
| ▲ | KallDrexx a day ago | parent | prev [-] | | The problem I have with this analysis is that it misses the multi-dimensional aspect of "is this profitable." It's fair to say that with all these operators competing for tokens, the OpenRouter operators (not sure of the exact phrase, but the people running the models) are pricing in some level of margin. However, how many of them are running their own data centers and GPUs? If they run their own infrastructure, it's not a simple question of whether each token set is served at a margin, since the answer has to account for the cost of running the data center. They may believe it's profitable long term by riding the long tail of asset depreciation, but that isn't guaranteed. If they aren't running their own infrastructure, it's much easier to claim a margin (beyond the servers that manage the rented infrastructure). HOWEVER, a lot of data centers offer some crazy low GPU prices and may be chasing user base and revenue over profitability. In that case, if data center build-out starts slowing, GPU prices very likely go up and inference stops being profitable for the OpenRouter operators. So long term, it's not clear how profitable even the open models are. OpenAI and Anthropic definitely fall into the latter category too. Their infrastructure requirements are much higher than the open models', and they are being given huge discounts so Microsoft/Amazon/Google can all claim revenue (since their profitability comes from other parts of the business). It's not clear whether OpenAI and Anthropic models would be profitable at inference if they were paying rates that cloud hosts could profit from. There are just too many dimensions to this scenario to flatly state that OpenRouter proves inference is profitable at scale. |
| |
| ▲ | ACCount37 a day ago | parent | prev | next [-] | | Check the token prices for open weight LLMs at various independent inference providers. That gives you a very good estimate of "how much can you serve the tokens of a model of the size N for while making a profit". Now, keep in mind: Kimi K2.5 is 1T MoE. Today's frontier LLMs are in the 1T to 5T range, also MoE. Make an estimate. Compare that estimate with the actual frontier lab prices. | | |
| ▲ | lolc a day ago | parent [-] | | I don't think it's as easy as looking at open-weight API prices. We don't know whether the operators are making a profit on all the hardware they bought. Maybe the prices we pay just cover electricity. And it's not even certain that running costs are covered by API prices: the operators may be siphoning content and subsidizing prices by selling it. In the current volatile environment, the API prices are more of a baseline, where we can assume it can't be much cheaper to operate these models. | |
| ▲ | aurareturn a day ago | parent [-] | | That doesn't make sense in this environment because everyone is compute constrained with huge backlogs they can't fulfill. If these inference providers aren't making any money, they'd simply sell their GPUs to those who are starved for compute. |
|
| |
| ▲ | bob1029 a day ago | parent | prev | next [-] | | https://www.cerebras.ai/blog/cerebras-cs-3-vs-nvidia-dgx-b20... | |
| ▲ | infecto a day ago | parent | prev | next [-] | | Most if not all private labs have said inference is profitable, and that was true even before the big push to scrap flat plans and largely charge folks the underlying API rates. Second, take a look at the pricing of open models. It's certainly not a direct 1:1 comparison, but we can use it as a baseline. Of course folks might not be telling the truth, but this is one of those situations where I see too many markers on the "true" side. For supply, look at the outages and growth rates at companies like OpenRouter. Demand is growing every week. |
| ▲ | paulddraper a day ago | parent | prev | next [-] | | Anthropic has said inference is profitable. That’s a biased source, but the math pencils. This is why switching to local open weight models saves a lot of money. (Even though it’s not apples to apples.) | | |
| ▲ | drakythe a day ago | parent | next [-] | | Anthropic also recently tweaked their usage limits to discourage use during peak hours. Why would they do that if inference was profitable? | | |
| ▲ | infecto a day ago | parent | next [-] | | Don't confuse inference (API usage) with the consumer plan products. When people say inference is profitable, they are referring to the cost to serve a token via the API. The consumer products are absolutely a question mark on profitability, and as we see with most of the business and enterprise plans, they're giving way to pure on-demand (API-priced) usage. |
| ▲ | strangegecko a day ago | parent | prev | next [-] | | Profitability doesn't imply infinite ability to scale. Of course they will want to prioritize their most profitable customers when they hit capacity issues. | |
| ▲ | aurareturn a day ago | parent | prev | next [-] | | They do it because demand is higher than the compute available to them. Their GPUs must be melting during peak hours, so they're encouraging people to move their workloads to off-peak hours where possible. This is the opposite of an AI bubble bursting. |
| ▲ | paulddraper a day ago | parent | prev | next [-] | | Those are subscription plans. They tweaked the limits/periods included in the subscription. Having higher limits for subscription plans didn't give them any more revenue. | |
| ▲ | financltravsty a day ago | parent | prev [-] | | Their infra team is very understaffed and they are reacting to the public backlash of "no 9s?" |
| |
| ▲ | nyeah a day ago | parent | prev [-] | | Can you give a few penciled numbers? | | |
| ▲ | paulddraper a day ago | parent [-] | | You can rent an H100 GPU for $4/hour. [1] That's roughly 300k tokens in the hour, and OpenAI charges about $6 for those tokens. Those are pessimistic assumptions. [1] https://lambda.ai/instances | |
| ▲ | hajile a day ago | parent | next [-] | | Can you keep that GPU 100% saturated at least 16 hours per day every day of the week? If not, you aren't breaking even. | | |
| ▲ | paulddraper a day ago | parent [-] | | Note this is also assuming you (1) Rent your GPUs. (2) Pay list price, no volume breaks. (3) Get only 85 tokens/sec. Realistically, frontier models would attain 200+ tokens/second amortized. Inference is extremely profitable at scale. | | |
| ▲ | aurareturn a day ago | parent [-] | | Assuming an 80GB H100 running an MoE model close to the size of its VRAM, you're going to see around 10k tokens/second fully batched and saturated. An example here might be Mixtral 8x7B. You're generating about 36 million tokens/hour. The cost of Mixtral 8x7B on OpenRouter is $0.54/M input tokens and $0.54/M output tokens, so you're looking at potentially $38.88/hour of return on that H100. This is probably the best-case scenario. In reality, inference providers will use multiple GPUs together to run bigger, smarter models at a higher price. |
|
| |
| ▲ | drakythe a day ago | parent | prev [-] | | It's $3.99/hour only at 8x instances, with a minimum 2-week commitment. Good luck getting a 70% usage average during that time. Useful when you're running a training round and can properly gauge demand, not so great when you're offering an API. | | |
| ▲ | infecto a day ago | parent [-] | | Is it not a good penciled number? It helps set the directional tone that at least inference cost is being covered. | | |
| ▲ | drakythe a day ago | parent [-] | | It says the numbers are theoretically possible. Requiring 66% usage to break even, when 100% usage will piss off customers by invoking a queue, means it’s a balancing act. “Technically correct. The best kind of correct.” So inference may technically be _capable_ of being profitable, but I have questions about it being profitable in _practice_. |
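The break-even figures in this sub-thread follow from the same $4/hour cost and $6/hour full-load revenue cited upthread; a sketch under those assumptions:

```python
# Break-even utilization implied by the $4/hour rental cost and
# $6/hour revenue at full load cited upthread (illustrative only).
cost_per_hour = 4.00
revenue_at_full_load = 6.00

breakeven_utilization = cost_per_hour / revenue_at_full_load  # ~0.667
breakeven_hours_per_day = 24 * breakeven_utilization          # 16 hours

print(f"break even at {breakeven_utilization:.0%} utilization, "
      f"i.e. {breakeven_hours_per_day:.0f} saturated hours/day")
```

This is where both the "16 hours per day" and the "66% usage" figures in this sub-thread come from.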
|
|
|
|
| |
| ▲ | a day ago | parent | prev [-] | | [deleted] |
|
|
| ▲ | iterateoften a day ago | parent | prev | next [-] |
According to OpenRouter, token demand is growing at something like 10% a week. It’s insane.
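For scale, 10% weekly growth compounds dramatically if sustained (treating the quoted rate as an assumption):

```python
# What "~10% a week" compounds to over a year, if sustained.
weekly_growth = 0.10
annual_multiple = (1 + weekly_growth) ** 52
print(f"{annual_multiple:.0f}x in a year")  # roughly 142x
```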
| |
| ▲ | infecto a day ago | parent [-] | | I wish this was higher up. I have been tracking the same since Thanksgiving ‘25 and the growth is unreal. Again I don’t know where the cards fall maybe the industry overspent on capex but it’s at least easier to see why they are spending based on demand. The risk of being left out is greater than overbuilding. | | |
| ▲ | coldpie a day ago | parent [-] | | I do wonder how much of the apparent demand is driven by companies automatically running these things when users didn't actually ask for it. For example every web search I make now has an AI response that I scroll right past. I'm sure that counts for someone's token usage data, but I got zero value from it. This is happening in almost every software product now. | | |
| ▲ | irke a day ago | parent [-] | | Tokens as a metric are the analogue of users as a metric. In the end, value per user is what matters for being a healthy going concern and for valuation (look at Meta, for example). Value per token should matter too - after all, that’s what people are paying for. |
|
|
|
|
| ▲ | chrisweekly a day ago | parent | prev | next [-] |
| > "decades of overinflated engineering salaries" 'Overinflated' relative to what? You make some good points but I don't accept this as a premise. |
| |
| ▲ | schmidtleonard a day ago | parent | next [-] | | Overinflated relative to the wet dreams of the ownership class. | | |
| ▲ | gruez a day ago | parent [-] | | It's not exactly stuff of "wet dreams of the ownership class" to say that of the possible white collar careers, software engineering is pretty hard to beat in terms of salary vs work you need to put in. | | |
| ▲ | schmidtleonard a day ago | parent | next [-] | | This is a story of other careers having salaries pushed down relative to inflating essentials and the resulting economic surplus being squeezed into asset portfolios. It's a story of rich people getting paid for being rich in proportion to how rich they are and soaking up more than 100% of economic growth for the last 50 years. Not a story of software engineers working for a living and getting what would have been a blue-collar salary for it. | |
| ▲ | mono442 a day ago | parent | prev | next [-] | | Average salaries for software engineering seem higher compared to other professions because the jobs are mostly in the most expensive cities to live in. There are no SWE jobs in smaller towns, but there are jobs for e.g. accountants. | |
| ▲ | financltravsty a day ago | parent | prev [-] | | Compensation is usually more tightly coupled with leverage, rather than "work." |
|
| |
| ▲ | fcarraldo a day ago | parent | prev [-] | | Well, not GP, but I do. Let’s look at the numbers: Median senior SWE salaries in SF: https://www.levels.fyi/t/software-engineer/levels/senior/loc... Median income in metro areas: https://www.cnbc.com/2024/07/11/the-median-salary-for-the-25... Engineering salaries are significantly higher than nearly every other industry on average and on median. Much of this is driven by VC funding rather than sound, profitable, bootstrapped businesses with sustainable profit margins. Engineering salaries have also been driven upwards significantly the past ~10 years (since the post-2008 crash recovery), while wage growth in the US is mostly stagnant. I don’t have a source handy for that, but there are plentiful studies. Outside of the US this may be less true, but I took GP’s “most of us on HN” to mean people who work in US tech companies, which are primarily concentrated in high-COL (cost of living) areas. | | |
| ▲ | marcyb5st a day ago | parent | next [-] | | Isn't salary a proxy for how hard one person or a group of persons is to replace, or how valuable they are? There was a surge in demand for SWEs and scarcity brought salaries up. Are they too high? Hell no. On average, my colleagues and I generated ~$2M each in 2025 for our company, while we get paid a fraction of that (grants and bonuses included). If you look at net income per employee, we are at around $700k each in 2025. Additionally, employers try their hardest to drive costs down (e.g. offshoring as much as possible, everyone doing layoffs at the same time, ...) and average/median salaries remained high. If salaries were overinflated, those numbers should have come down, I believe. The fact that they didn't makes me think it is still a scarcity problem, not an overinflation one. | | |
| ▲ | gruez a day ago | parent [-] | | >There was a surge in demand for SWEs and scarcity brought salaries up. Are they too high? Hell no. On average, my colleagues and I generated ~$2M each in 2025 for our company, while we get paid a fraction of that (grants and bonuses included). If you look at net income per employee we are at around $700k each in 2025. So by that logic, housing in coastal cities also isn't "overinflated"? After all, like SWEs, they're also scarce and in demand. They're also providing enormous value to the people buying/living in them, otherwise they'd be living in Oklahoma or whatever and paying a fraction of the cost. | | |
| ▲ | marcyb5st a day ago | parent | next [-] | | Maybe we give different meanings to the word overinflated. I see it as something that is speculative/shady in nature. Is housing overinflated? Probably in some places for sure, because those who already have a house or invested in real estate want to cut down supply to raise prices. Is it the same on the job market? I don't think so. I never heard any SWE saying "let's scare people away from a CS career so we can bargain for higher salaries". The opposite is true though. Companies participate in career fairs and pre-uni events to make people gravitate towards a CS career, ... so with a higher supply each employee loses a bit of bargaining power. Small excursus: this very fact was taken to the extreme in 2022, when everyone did layoffs at the same time despite the numbers still being great. If you put 300k people on the street at around the same time you can hire some of them for way less money, as they have lost all leverage (since there are another 299,999 people waiting in line for a job). | |
| ▲ | B56b a day ago | parent | prev [-] | | Ya, that sounds right to me. Coastal city housing is very supply constrained, part of why it's so expensive, but it is hugely in demand and provides tons of value to many by letting them live near high paying companies. Unless by "overinflated" you mean a constrained supply/demand curve? |
|
| |
| ▲ | rileymichael a day ago | parent | prev | next [-] | | > Engineering salaries are significantly higher than nearly every other industry on average and on median now compare the profit per employee at tech (software engineering) companies and those industries.. | | |
| ▲ | fcarraldo a day ago | parent [-] | | At the top end (say, top 100 tech companies) it’s pretty high indeed. Public companies, for sure, as otherwise their stock price would tank. It’s not uncommon in this industry to have margins above 70-80%. But there are thousands if not tens of thousands where the profit per employee is minimal or negative. I can’t find a source for all tech (the data wouldn’t exist for private firms anyway) but I think it’s telling to look at this list, scroll down to about the middle and look around at salaries you or your colleagues are pulling. Software revenues are certainly high but the industry is afloat because of these high margin businesses creating returns so that low margin businesses can exist. Without the massive infusion in upfront capital, very uncommon in other industries, it’s simply not sustainable. Typically a market that’s buoyed by its top performers but has significant amounts of capital tied up in under performers is called “a bubble”. https://www.trueup.io/revenue-per-employee |
| |
| ▲ | infecto a day ago | parent | prev [-] | | Thank you for saying it better than I could have. It’s probably an unnecessary jab but I know how well I benefited financially in an industry where not much was expected in terms of output, lavish perks and huge base salary and stock compensation. Absolutely some companies are extremely profitable per headcount but I look at the sea of failures and how well engineers have generally done. It sets the tone for this massive negativity I see around AI when so many of us have benefited from VC money that failed. |
|
|
|
| ▲ | malfist a day ago | parent | prev | next [-] |
| > The cost to serve tokens is absolutely profitable today How can you possibly say that? Everyone knows that's not the case, these companies are losing money every day selling tokens. Revenue is not the same thing as profit. |
| |
| ▲ | infecto a day ago | parent | next [-] | | Don’t confuse what I say. Bottom line, these companies are not profitable yet, but it is profitable to serve a token via the API. They have increasing demand, not enough supply, and models are getting better on quick timelines. For sure there may be some losers, but it’s not hard to see that token serving can be a profitable activity. | | | |
| ▲ | jeromegv a day ago | parent | prev | next [-] | | Yep, especially if we look at what happened just last week, both Google and Anthropic have dropped how much you get out of your existing plans. | | |
| ▲ | infecto a day ago | parent | next [-] | | Don’t confuse plan changes with profitability of inference. When people talk about the cost to serve a token and it being profitable we are referring to API cost not the plans which absolutely subsidize some level of use. Hard to know what is breakeven on plan math. | |
| ▲ | surajrmal a day ago | parent | prev [-] | | That's not necessarily a profitability thing as much as a demand thing. The only way to improve the supply for those willing to pay more is to take it away from those paying less. Once supply catches up to demand, things will change |
| |
| ▲ | dist-epoch a day ago | parent | prev | next [-] | | There are private companies which rent/buy GPUs, run open-weight LLMs on them and sell the tokens. They absolutely make a profit, and their clients think they're getting a good deal and keep buying the tokens. | |
| ▲ | naravara a day ago | parent | prev [-] | | I think they’re losing money because they have to amortize the costs of training the models in the first place, which is where most of the resource sink is. This is why they were freaking out about DeepSeek just taking the trained model weights and slapping an interface on it. | | |
| ▲ | malfist a day ago | parent [-] | | Thats like saying a restaurant is profitable because they're making money selling meals if you ignore the costs of ingredients. Of course they are profitable if you ignore their cost to bring a product to market. | | |
| ▲ | infecto a day ago | parent | next [-] | | The problem with that comparison is restaurants largely don’t have much room to adjust price or optimize cost. The AI industry is too new with many unknowns right now so investors are willing to take risk. For the hyperscalers the bet is that being left out is going to be a greater loss than overbuilding. | |
| ▲ | naravara a day ago | parent | prev [-] | | That’s the wrong analogy. Model training is more like the setup costs of developing the menu and training staff. What’s driving the costs is important when talking about financial sustainability. If it’s mostly coming from optional R&D investments instead of the direct costs of producing the food then you can simply not exercise the option and be profitable. If it’s more coming as a variable cost that scales with each meal served that’s a very different situation. Yeah it should be factored in, but it’s a different set of implications for long term sustainability. They don’t actually need to test and optimize a new menu every day or week. If they decide to just stick to the same one longer they can get way more return from each dollar spent on development. It’s just that right now the rate of improvement you get with training is really high and nobody can afford to fall behind their competition. | | |
| ▲ | malfist a day ago | parent [-] | | These companies are continually training new models. This is not a long term amortized cost, it's actual COGS. Yeah, sure you can ignore the cost of purchasing the building for the restaurant for most profitability calculations, but if every year or two you were tearing down your old building and building a new one, you better believe that has to be in your profitability calculations. | | |
| ▲ | s1artibartfast a day ago | parent [-] | | I think the relevant question to define the counterfactual is what would happen if they stopped training. If you can simply not remodel your restaurant and keep making money, then yeah, it makes sense to call it profitable. |
|
|
|
|
|
|
| ▲ | Tade0 a day ago | parent | prev | next [-] |
| My main worry is - once this is all over, the market consolidates and using LLMs will become a requirement in job listings, what's the highest price per million tokens companies will be able to charge us? Currently on a given day I'm chewing through approximately the equivalent of my lunch money, but where there's opportunity to extract wealth, someone will find a way to do it. |
| |
| ▲ | h14h a day ago | parent | next [-] | | My (potentially naive) take is that open models will save us. The biggest markets for LLMs (e.g. coding) are narrow-enough to be served well by smaller models with proper RL. Cursor's Composer 2 (created from a Kimi K2.5 base) is a great example, and I expect it to be the first of many. The wealth of great open models provide an excellent base for fine-tuning, distillation, and RL. I see a lot of untapped potential in the field of bespoke, purpose-built models that can be served far more cheaply than the frontier competition. I would not be surprised if we see frontier-adjacent experiences running comfortably on a Mac Mini by year end. With frontier models seemingly hitting diminishing returns in quality, I struggle to see a world in which gigantic, expensive, general-purpose models don't become increasingly niche. | |
| ▲ | bluedays a day ago | parent | prev | next [-] | | It's already a job requirement for a bunch of places, they're just not listing it. I lost out on a job recently because I haven't used cursor ai. | | |
| ▲ | Tade0 11 hours ago | parent [-] | | My friend has been looking for a job for the past few months and the other day he was given an HR LLM Agent to talk to. He contacted the company saying he's not going to do that, to which they replied something along the lines of "sorry, that's our process". |
| |
| ▲ | dist-epoch a day ago | parent | prev [-] | | Jensen is already talking about $1000/mil tokens soon. But there is no real upper limit. Imagine an LLM which could answer the question "what does my company need to do to beat the competition?". And then realize that the competition asks their LLM the same question. So now everybody is bidding the price up or using more tokens to get a better answer. | | |
| ▲ | Tade0 11 hours ago | parent | next [-] | | This is the kind of bullshit I'm worried about the most. Nvidia has a de facto monopoly in the datacenter-tier GPU market. He says this sort of stuff because he knows he can keep jacking up prices, because the cost will be transferred onto consumers - mainly software engineers. | | |
| ▲ | dist-epoch 9 hours ago | parent [-] | | If those software engineers keep buying it must mean they get more value out of it than they pay, right? |
| |
| ▲ | irke a day ago | parent | prev [-] | | Complete nonsense. In that world there’s no reason for a business enterprise to exist. |
|
|
|
| ▲ | SirensOfTitan a day ago | parent | prev | next [-] |
This is a classic case of HN mistaking the map for the territory. R&D and capex absolutely figure into de-facto profitability and sustainability for AI labs, despite their separate treatment in accounting. > well most of us here on HN have benefited from decades of overinflated engineering salaries being paid by often companies that were not profitable and not only unprofitable This is a really concerning perspective: people were paid what they were worth. Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently. I will also note: a startup raising an $8MM Series A and eventually fizzling out is not the same as the hundreds of billions invested in these AI companies without a path to profitability. It is utterly absurd to pretend these are the same thing: any company ingesting that much cash needs to justify its capacity to survive.
| |
| ▲ | fcarraldo a day ago | parent | next [-] | | > Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently Software salary inflation and expansion has made this the case. Tech’s accessibility to the educated has accelerated gentrification massively, driving up prices on rent and food. While the statement is correct, tech’s contribution to income inequality is part of the issue. If you’ve lived in Austin or Chicago (especially Austin) prior to ~2010 you’ll have seen this first hand. | | |
| ▲ | reverius42 a day ago | parent [-] | | I don't think there are enough well-paid tech workers to affect things like the (national) market for food. Local rent markets are at least partly explained by this; I agree that the $3M houses near Palo Alto, CA are because of Big Tech salaries, but not the price of ground beef. | | |
| ▲ | fcarraldo 20 hours ago | parent [-] | | Not at the nationally owned chain grocery store, but at local establishments it’s certainly an issue and prices out longtime residents who don’t work in this industry. Rent prices push everything up in the local market. Housing rents impact business rents in an area, as well as what the business’ owners and employees need to maintain their own lifestyles. People who live there now can pay more and will, so prices go up. But the local shop owner isn’t getting rich, they’re still struggling as everything around them rises. |
|
| |
| ▲ | ForHackernews a day ago | parent | prev | next [-] | | > any company ingesting that much cash needs to justify its capacity to survive. What, why? There are tons of low-margin capex-intensive businesses out there. I think AI will end up being like hosting. All the models will converge to being pretty decent and the companies will have to compete on efficiency, since they are selling a generic commodity. You can already see that Anthropic fears this scenario, since they try so hard to make people use their first-party tools rather than plugging Claude in as a generic part of a third-party stack. LLM hosting is the next VPS. | | | |
| ▲ | guzfip a day ago | parent | prev | next [-] | | > Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently. I want to add something additional to this: it is one of the few fields that can afford a middle or upper middle class lifestyle and is accessible. I have no doubt that if I could redo my life with the necessary resources I’d be more than capable of putting myself through med school and ending up with a secure career that paid more than I ever made in software. But at this stage of life? I don’t have the time or money to spend a decade+ paying some institution tens of thousands of dollars to hopefully maybe have a real career. Once software as a career dies, I suspect many will find themselves locked out of the middle class for generations. | | |
| ▲ | WarmWash a day ago | parent | next [-] | | It was kind of a flash in the pan moment where you could leave your retail floor manager job, crash course this thing called "javascript" in a 3 month class, and then get hired for a six figure remote job if you could choke out a mildly competent github repo. | | |
| ▲ | infecto a day ago | parent [-] | | Exactly. I don’t know why folks take so much offense to it. You could absolutely do just as you describe. Spend 3-4 hours truly working while enjoying the lunch-and-learn sessions, the in-house lunch, and the barista. I definitely benefited from this and I am not ashamed of it, but absolutely it was this weird moment in time. |
| |
| ▲ | 9rx a day ago | parent | prev [-] | | > I suspect many will find themselves locked out the middle class for generations. On the other hand, once software as a high paying career dies there will be nothing to prop up the status quo (high cost of housing, for example) so the middle class will return to being much more accessible to modestly paid jobs. |
| |
| ▲ | infecto a day ago | parent | prev | next [-] | | Oh come on, there are no “classic HN mistakes” here. Inference is profitable but the bottom line is not yet. This is a very young industry and, unlike those of the past, it’s much easier to picture a possibility of profitability. It’s absolutely different in that the marginal cost scales linearly, but solving for the R&D portion of a product where supply cannot keep up is a lot easier than some SaaS where the underlying product is not being used. The salary jab was probably a little harsh. Your ending is a bit of a fizzle too. There are many capex-intense businesses that do just fine. | |
| ▲ | keybored a day ago | parent | prev | next [-] | | > This is a really concerning perspective: people were paid what they were worth. Even interpreting what-they-were-worth in the usual sense, I’m not so sure about this. We have seen wage collusion reported by the usual US West Coast-based companies. And some news on here[1] has reported that some engineer with a salary of $100K[2] might be producing $1M of value. Even factoring in the usual “but benefits and overhead” comes out to a solid factor of profit per programmer/engineer. Despite that, the sense I get (only from this site since that is my only reference) is that the so-called overpaid engineers are incredibly content to just have this happen to them. As long as they are paid well compared to other workers, it’s fine. No matter the profit factor. In fact, the discourse is very much focused on how “privileged” they were if the tide ever changes. Instead of realizing how much value they provided, collectively. The outlet for capturing more of the value they create is entrepreneurship (hello, HN). Never any collective organizing. And entrepreneurship is easily bought via acquisition. Collective bargaining would have been relevant in case they ever get automated... by the very software they co-created. One could imagine that this “privileged” collection of programmers could have served as a vanguard for the collective good of programming professionals as well as collective ownership of software goods, using their privilege to that end. The former never happened, and the latter is partly realized in people’s free time (see the OSS maintainer in Nebraska meme).[3] [1] All from recollection since this is just news from the Frontier to me [2] Of course the pay might be much higher now; this might have been a while ago [3] when it isn’t simply exploited by corporations just using OSS without giving any back; a logical turn of events when no license or law forces them to contribute back | |
| ▲ | guzfip a day ago | parent | next [-] | | > As long as they are paid well compared to other workers, it’s fine. Well I’m sure they’ll be thrilled to know they can collect $100 a week more in unemployment benefits than their neighbor. | | |
| ▲ | keybored a day ago | parent [-] | | I wasn’t alluding to them resigning or whatever this comment is referring to. | | |
| ▲ | reverius42 a day ago | parent [-] | | I don't think you get to collect any unemployment if you resign. | | |
| ▲ | keybored 2 hours ago | parent [-] | | Really. | | |
| ▲ | reverius42 2 hours ago | parent [-] | | Yes, really. At least in the USA, you only get to collect unemployment if you are laid off -- if you are fired for cause, or leave your job of your own volition, you are not eligible. | | |
|
|
|
| |
| ▲ | a day ago | parent | prev [-] | | [deleted] |
| |
| ▲ | 9rx a day ago | parent | prev [-] | | > This is a really concerning perspective: people were paid what they were worth. The parent comment doesn't discount that, only pointing out that "what they were worth" was inflated due to a speculative environment. Wherein lies your concern? | | |
| ▲ | lotsofpulp a day ago | parent | next [-] | | That prices change from one point in time to another is a trivial fact. “Inflated due to a speculative environment” is not an accurate way to frame labor prices that held for many years. At that point, the prices were simply high due to high demand relative to supply (compared to other types of labor). | | |
| ▲ | 9rx a day ago | parent [-] | | > At that point, the prices were simply high due to high demand relative to supply That goes without saying. The investigation here is into demand. Which was said to be overinflated due to speculation. As noted, many of the companies hiring the developers did not have viable businesses. | | |
| |
| ▲ | SirensOfTitan a day ago | parent | prev [-] | | I think calling it inflated is to play to a narrative that labor was overvalued broadly in tech. Salaries across industries in the US have remained flat since the 1970s. Calling the one sector that can provide access to a middle class lifestyle inflated is to play into a narrative capital is eager to tell, even if OP didn't intend that. | |
| ▲ | 9rx a day ago | parent [-] | | > Salaries across industries in the US have remained flat since the 1970s What do you mean? The real (meaning adjusted for inflation) hourly wage in the US has increased by around 20% since 1970. What has changed since the 1970s is that wages are no longer coupled to productivity. Perhaps that is what you are thinking of? But that should be an obvious truism for anyone in tech. We create the very things that cause that to be the case! | | |
| ▲ | keybored a day ago | parent [-] | | > We create the very things that cause that to be the case! What happened in the 1970’s was the NeoLiberal shift and wasn’t caused by software. | | |
| ▲ | win311fwg a day ago | parent [-] | | That NeoLiberal shift did not take place in a vacuum. It was a product of the world around it. It absolutely was caused by tech. If we — those with the power to build the productivity creators — took a stand and said "we refuse to create tech for the interests of the few" it would have never happened. But, instead, we welcomed it and are responsible for it. | | |
| ▲ | keybored a day ago | parent [-] | | The corollary of “if we took a stand” is that Capital took a stand and collectively undid a lot of the gains of the post-WWII social democratic order. So no. It wasn’t caused by tech beyond the uninteresting factors like modern society being complex and, of course, that tech developments influence things (pretty much all things). | | |
| ▲ | win311fwg a day ago | parent [-] | | The productivity gains we've seen above the capacity of human productivity would have been impossible without tech. It absolutely was caused by tech. The benefactor of those gains was also entirely decided by those who created the tech. We could have given use of that tech to everyone. In some cases we actually did (e.g. open source), but in most cases we gained (at least partial) ownership of the capital so it was in our best personal economic interest to keep it for ourselves and our close friends. | | |
| ▲ | keybored a day ago | parent [-] | | > The productivity gains we've seen above the capacity of human productivity would have been impossible without tech. It absolutely was caused by tech. Would have been impossible without and being caused by are different things. The sense of “caused by” in a political context is about the people who have the power to direct things. Which are not necessarily the people who implement something. > The benefactor of those gains was also entirely decided by those who created the tech. You assert that they were decided by. Based on what? The vast majority of tech work was done in employment, either for some government or for private entities. The private entities were controlled by Capital. The governments were controlled by democratic forces and Capital. > We could have given use of that tech to everyone. In some cases we actually did (e.g. open source), Again I reference the meme of the Overworked Nebraska OSS Maintainer. The impressive work done on OSS by tech workers directly has been done in their free time. The bulk of OSS work done by people as a living is probably through corporations like e.g. Intel working on the Linux kernel.[1] That impressive free-time work has gotten the reputation of a treasure trove for the highly motivated and tech literate. In contrast to something that regular people can plug-and-play as an alternative to Big Tech dominance. > , but in most cases we gained (at least partial) ownership of the capital so it was in our best personal economic interest to keep it for ourselves and our close friends. Yes, well played. For those that got away with their financial-independence millions. For the rest, well, I guess they never managed to learn the moral lesson of Monopoly. [1] Or am I wrong here? I could be off-base. | | |
| ▲ | win311fwg a day ago | parent [-] | | > in a political context While you are right to recognize that there was some attempt to inject political context, it was not there originally, and is not the main discussion taking place. The fact that wages and productivity have become decoupled is not inherently political. It is but simple mathematics. Tech is the cause for the decoupling; it is why we have been able to become continually more productive and at an accelerating rate. > The vast majority of tech work was done in employment Yes, but generally even where employment is present tech workers also demand a share in ownership (e.g. stock). Tech doesn't invent itself. At least not yet. The workers have held the cards. Even those who haven't won the lottery are still in a pretty good economic position, relatively speaking. | | |
| ▲ | keybored 4 hours ago | parent [-] | | > While you are right to recognize that there was some attempt to inject political context, it was not there originally, and is not the main discussion taking place. I don’t care if anyone wants political context to be there or not. Political context is not some subjective choice that the participants in a discussion can choose to be the case or not, like some alternative history exercise. This political context (i.e. reality) called NeoLiberalism is so well-researched and argued that I can just call it NeoLiberalism and even a forum full of techheads don’t bat an eye. Which is more than can be said for your incoherent nuh-uh where both: - Technology just determines things by itself - And (also) the rank and file peons who implement technology could have forced something better on the world (than the pile of shit that we have) |
|
| ▲ | a day ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | thereitgoes456 a day ago | parent | prev | next [-] |
| > The cost to serve tokens is absolutely profitable Can you explain why you know better than the analyst at Cursor cited in this article? |
| |
| ▲ | iterateoften a day ago | parent | next [-] | | OpenRouter is an upper bound on the compute cost for the open-source models. So people assume that Opus and Sonnet really aren't sucking up 10x the resources, because open-source models aren't 10x worse. Idk if it's true or not, but Haiku is $5 per million tokens and it is much worse than the $2-3 per million token models imo | | |
| ▲ | vbarrielle a day ago | parent [-] | | OpenRouter is a startup; what's the indication that it serves tokens at a profit? It could be serving them at a loss to show growth. |
| |
| ▲ | infecto a day ago | parent | prev | next [-] | | Can you cite your source for the analyst at Cursor? I read the article and looked through the boatload of links but struggled to find what you are referring to. Ty | |
| ▲ | noelsusman a day ago | parent | prev [-] | | That analyst was talking about subsidizing tokens through the subscription plans, which is a different claim. | | |
| ▲ | infecto a day ago | parent [-] | | Ty for sharing and agree. I think some folks in the comments on this post are confusing inference profitability with plan profitability. Most plans, as far as we can tell, are probably teetering on the line of profitability, and that's why we have seen some, like Cursor, really tighten how many tokens you get. |
|
|
|
| ▲ | Eridrus a day ago | parent | prev | next [-] |
| The article is just helpfully illustrating how artisanal you can make your slop if you really try! |
|
| ▲ | nickphx a day ago | parent | prev [-] |
| step change? how? profitable? where did you read that? people want tokens? really? who are these people? |
| |
| ▲ | elzbardico a day ago | parent [-] | | Yeah, if we just ignore R&D, fixed costs, depreciation, and the fact that there's a high likelihood investors were expecting a return, then, ignoring all of that and trusting their numbers, we may say inference turns a profit. In accounting, almost anything you want can be true, at least for some time. |
|