| ▲ | louiereederson 8 hours ago |
| LLMs exist on a logarithmic performance/cost frontier. It's not really clear whether Opus 4.5+ represents a level shift on this frontier or just inhabits a place on that curve that delivers higher performance, but at rapidly diminishing returns to inference cost. To me, it is hard to reject this hypothesis today. The fact that Anthropic is rapidly trying to increase prices may betray the fact that their recent lead comes at the cost of dramatically higher operating costs. Their gross margins in this past quarter will be an important data point on this. I think the tendency for model assessment graphs to display the log of cost/tokens on the x axis (i.e. Artificial Analysis' site) has obscured this dynamic. |
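A toy sketch of the frontier shape described above (the score function, coefficients, and per-Mtok prices are all invented for illustration, not measurements of any real model):

```python
import math

def frontier_score(cost_per_mtok: float, base: float = 40.0, slope: float = 12.0) -> float:
    """Hypothetical frontier: benchmark score grows with log2(cost),
    so each doubling of spend buys the same fixed score increment."""
    return base + slope * math.log2(cost_per_mtok)

# Doubling spend from $1 to $2 per Mtok and from $32 to $64 per Mtok
# buys the identical score gain, i.e. returns per dollar collapse as
# you move right along the curve.
cheap_gain = frontier_score(2) - frontier_score(1)
pricey_gain = frontier_score(64) - frontier_score(32)
print(cheap_gain, pricey_gain)  # 12.0 12.0
```

On a chart with log(cost) on the x axis this renders as a straight line, which is exactly why the diminishing dollar returns are easy to miss.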
|
| ▲ | louiereederson 7 hours ago | parent | next [-] |
| I meant to reference Toby Ord's work here. I think his framing of the performance/cost frontier hasn't gotten enough attention: https://www.tobyord.com/writing/hourly-costs-for-ai-agents |
| |
| ▲ | dang 2 hours ago | parent | next [-] | | Let's give that one a SCP* re-up: https://news.ycombinator.com/item?id=47778922 (* explained at https://news.ycombinator.com/item?id=26998308) | |
| ▲ | fragmede 5 hours ago | parent | prev [-] | | That post doesn't address the human factor of cost, and I don't mean that in a good way. Even if AI costs more than a human, it's tireless, doesn't need holidays, is never going to have to go to HR for sexual harassment issues, won't show up hungover or need an advance to pay for a dying relative's surgery. It can be turned on and off with the flip of a switch. Hire 30 today, fire 25 of them next week. Spin another 5 up just before the trade show demo needs to go out and fire them with no remorse afterwards. | | |
| ▲ | lbreakjai 3 hours ago | parent | next [-] | | The cost to hire a human is highly predictable. The cost of AI isn't. I, as a human, need food and shelter, which puts a ceiling on my bargaining power. I can't withdraw my labour indefinitely. The power dynamics are also vastly against me. I represent a fraction of my employer's labour, but my employer represents 100% of my income. That dynamic is totally inverted with AI. You are a rounding error on their revenue sheet, but they have a monopoly on your work throughput. How do you budget a workforce that could turn 20% more expensive overnight? | | |
| ▲ | bornfreddy 3 hours ago | parent | next [-] | | By continuously testing competitors and local LLMs? The reason for rising prices is that they (Anthropic) probably realized that they have reached a ceiling of what LLMs are capable of, and while it's a lot, it is still not a big moat and it's definitely not intelligence. | |
| ▲ | alex_sf an hour ago | parent | prev | next [-] | | The same way companies already deal with any cost. | |
| ▲ | zer00eyz 3 hours ago | parent | prev [-] | | > The cost of AI isn't. This is why there are a ton of corps running the open source models in house... Known costs, known performance, upgrade as you see fit. The consumer backlash against 4o was noted by a few orgs, and they saw the writing on the wall... they didn't want to develop against a platform built on quicksand (see the open web, apps on Facebook and a host of other examples). There are people out there making smart AI business decisions, to have control over performance and costs. |
| |
| ▲ | piker 4 hours ago | parent | prev | next [-] | | That was a great promise before the models started becoming "moody" due to their proprietors arbitrarily modifying their performance capabilities and defaults without transparency or recourse. | | |
| ▲ | mh- 41 minutes ago | parent [-] | | I still haven't seen any statistically sound data supporting that this is happening on the API (per-token pricing.) If you've got something to share I'd love to see it. |
| |
| ▲ | louiereederson 4 hours ago | parent | prev | next [-] | | I think it's difficult to say agentic and human developer labor are fungible in the real world at this point. Agents may succeed in discrete tasks, like those in a benchmark assessment, but those requiring a larger context window (i.e. working in brownfield systems, which is arguably the bulk of development work) favor developers for now. Not to mention that at this point a lot of necessary context is not encoded in an enterprise system, but lives in people's heads. I'd also flip your framing on its head. One of the advantages of human labor over agents is accountability. Someone needs to own the work at the end of the day, and the incentive alignment is stronger for humans given that there is a real cost to being fired. | | |
| ▲ | kennywinker 4 hours ago | parent [-] | | For some the appeal of agent over human is the lack of accountability. “Agent, find me ten targets in iran to blow up” - “Okay, great idea! This military strike isn’t just innovative - it’s game changing! A reddit comment from ten years ago says that military often uses schools to hide weapons, so here is a list of the ten most crowded schools in Iran” | | |
| ▲ | Our_Benefactors 3 hours ago | parent [-] | | It must be wild to actually go through life believing the things written in this post and also thinking you have a rational worldview. |
|
| |
| ▲ | michaelbuckbee 3 hours ago | parent | prev | next [-] | | More importantly it collapses mythical-man-month communication overhead. | |
| ▲ | pona-a 4 hours ago | parent | prev | next [-] | | I think the word you're looking for is contractors. But yes, you still have to treat those with _some_ human decency. | |
| ▲ | krainboltgreene 4 hours ago | parent | prev | next [-] | | Ah-ha, the perfect slave. | |
| ▲ | cyanydeez 4 hours ago | parent | prev [-] | | it'll just delete the production database when flustered. no biggie. we're learning how to socialize again. can't let all that history go to waste. | | |
|
|
|
| ▲ | Aurornis 6 hours ago | parent | prev | next [-] |
> It's not really clear whether Opus 4.5+ represent a level shift on this frontier or just inhabits place on that curve which delivers higher performance, but at rapidly diminishing returns to inference cost. I think we're reaching the point where more developers need to start right-sizing the model and effort level to the task. It was easy to get comfortable with using the best model at the highest setting for everything for a while, but as the models continue to scale and reasoning token budgets grow, that's no longer a safe default unless you have unlimited budgets. I welcome the idea of having multiple points on this curve that I can choose from, depending on the task. I'd welcome an option to have an even larger model that I could pull out for complex and important tasks, even if I had to let it run for 60 minutes in the background and made my entire 5-hour token quota disappear in one question. I know not everyone wants this mental overhead, though. I predict we'll see more attempts at smart routing to different models depending on the task, along with the predictable complaints from everyone when the results are less than predictable. |
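The routing idea might look something like this in practice (the tier names, capability scores, and prices are made up; a real router would also need a difficulty estimator, which is the genuinely hard part):

```python
# Hypothetical model tiers: pick the cheapest one whose capability
# meets the task's estimated difficulty. All figures are invented.
TIERS = [
    ("small",  {"capability": 1, "usd_per_mtok": 0.25}),
    ("medium", {"capability": 2, "usd_per_mtok": 3.00}),
    ("large",  {"capability": 3, "usd_per_mtok": 15.00}),
]

def route(estimated_difficulty: int) -> str:
    """Return the cheapest tier that can plausibly handle the task."""
    for name, spec in TIERS:
        if spec["capability"] >= estimated_difficulty:
            return name
    return TIERS[-1][0]  # nothing qualifies: fall back to the biggest model

print(route(1))  # small
print(route(3))  # large
```

The complaints the parent predicts follow directly from the `estimated_difficulty` input: misjudge it low and a cheap model makes a mess, misjudge it high and you burn budget for nothing.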
| |
| ▲ | KronisLV 3 hours ago | parent | next [-] | | > It was easy to get comfortable with using the best model at the highest setting for everything for a while, but as the models continue to scale and reasoning token budgets grow, that's no longer a safe default unless you have unlimited budgets. For a while I used Cerebras Code for 50 USD a month with them running a GLM model and giving you millions of tokens per day. It did a lot of heavy lifting in a software migration I was doing at the time (and made it DOABLE in the first place), BUT there were about 10 different places where the migration got fucked up and had to manually be fixed - files left over after refactoring (what's worse, duplicated ones basically), some constants and routes that are dead code, some development pages that weren't removed when they were superseded by others and so on. I would say that Claude Code with throwing Opus at most problems (and it using Sonnet or Haiku for sub-agents for simple and well specified tasks) is actually way better, simply because it fucks things up less often and review iterations at least catch when things are going wrong like that. Worse models (and pretty much every one that I can afford to launch locally, even ones that need around ~80 GB of VRAM in the context of an org wanting to self-host stuff) will be confidently wrong and place time bombs in your codebases that you won't even be aware of if you don't pay enough attention to everything - even when the task was rote bullshit that any model worth its salt should have resolved with 0 issues. My fear is that models that would let me truly be as productive as I want with any degree of confidence might be Mythos tier and the economics of that just wouldn't work out. | | |
| ▲ | Aurornis an hour ago | parent [-] | | Good points. I was speaking from a position of using an LLM in a pair programming style where I'm interacting with each request. For handing work off to an LLM in large chunks, picking the best model available is the only way to go right now. |
| |
| ▲ | dustingetz 4 hours ago | parent | prev | next [-] | | Human dev labor cost is still the high pole in the tent, even multiplying today's subsidized subscription cost by 10x. If the capability improvement trajectory continues, developers should prepare for a new economy where more productivity is achieved by fewer devs by shifting substantial labor budget to AI. | | |
| ▲ | johnmaguire 2 hours ago | parent | next [-] | | I'm getting a lot more done by handing off the code writing parts of my tasks to many agents running simultaneously. But my attention still has its limits. | |
| ▲ | what an hour ago | parent | prev [-] | | Your employer doesn’t pay the subscription cost, they pay per token. So it’s already way more than 10x the cost. |
| |
| ▲ | richstokes 3 hours ago | parent | prev | next [-] | | The problem is half the time you don't know you need the better model until the lesser model has made a massive mess. Then you have to do it again on the good model, wasting money. The "auto" modes don't seem to do a good job at picking a model IME. | | | |
| ▲ | dahart 4 hours ago | parent | prev | next [-] | | > I know not everyone wants this mental overhead, though. I’m curious how to even do it. I have no idea how to choose which model to use in advance of a given task, regardless of the mental overhead. And unless you can predict perfectly what you need, there’s going to be some overuse due to choosing the wrong model and having to redo some work with a better model, I assume? | |
| ▲ | Leynos 4 hours ago | parent | prev | next [-] | | Isn't that essentially GPT Pro Extended Thinking? | |
| ▲ | jpalawaga 5 hours ago | parent | prev | next [-] | | Except developers can’t even do that. Estimation of any not-small task that hasn’t been done before is essentially a random guess. | | |
| ▲ | nilkn 5 hours ago | parent | next [-] | | I don't completely agree. Estimation is nontrivial, but not necessarily a random guess. Teams of human engineers have been doing this for decades -- not always with great success, but better than random. Deciding whether to put an intern or your best staff engineer on a problem is a challenge known to any engineering manager and TPM. | | |
| ▲ | jpalawaga an hour ago | parent [-] | | or tech lead. or whoever. the point is, someone has to do the sizing. I think applying an underpowered agent to a task of unknown size is about as good as getting the intern to do it. Even EMs and TPMs are assigning people based on their previous experience, which generally boils down to "i've seen this task before and I know what's involved," "this task is small, and I know what's involved," or "this task is too big and needs to be understood better." |
| |
| ▲ | justapassenger 4 hours ago | parent | prev [-] | | That's why you split tasks and do project management 101. That's how things worked pre-AI, and old problems are new problems again. When you run any bigger project, you have senior folks who tackle hardest parts of it, experienced folks who can churn out massive amounts of code, junior folks who target smaller/simpler/better scoped problems, etc. We don't default to tell the most senior engineer "you solve all of those problems". But they're often involved in evaluation/scoping down/breakdown of problem/supervising/correcting/etc. There's tons of analogies and decades of industry experience to apply here. | | |
| ▲ | jpalawaga an hour ago | parent [-] | | Yeah... you split tasks into consecutively smaller tasks until it's estimateable. I'm not saying that can't be done, but taking a large task that hasn't been broken down needs, you guessed it, a powerful agent. that's your senior engineer who can figure out the rote parts, the medium parts, and the thorny parts. the goal isn't to have an engineer do that. we should still be throwing powerful agents at a problem, they should just be delegating the work more efficiently. throwing either an engineer or an agent at any unexplored work means you just have to delegate the most experienced resource to, or suffer the consequences. |
|
| |
| ▲ | KaiShips 5 hours ago | parent | prev [-] | | [dead] |
|
|
| ▲ | snek_case 7 hours ago | parent | prev | next [-] |
| They're also getting closer to IPO and have a growing user base. They can't justify losing a very large number of billions of other people's money in their IPO prospectus. So there's a push for them to increase revenue per user, which brings us closer to the real cost of running these models. |
| |
| ▲ | giwook 7 hours ago | parent | next [-] | | I agree, and I'm also quite skeptical that Anthropic will be able to remain true to its initial, noble mission statement of acting for the global good once they IPO. At that point you are beholden to your shareholders and no longer can eschew profit in favor of ethics. Unfortunately, I think this is the beginning of the end of Anthropic and Amodei being a company and CEO you could actually get behind and believe that they were trying to do "the right thing". It will become an increasingly more cutthroat competition between Anthropic and OpenAI (and perhaps Google eventually if they can close the gap between their frontier models and Claude/GPT) to win market share and revenue. Perhaps Amodei will eventually leave Anthropic too and start yet another AI startup because of Anthropic's seemingly inevitable prioritization of profit over safety. | | |
| ▲ | snek_case 7 hours ago | parent | next [-] | | I think the pivot to profit over good has been happening for a long time. See Dario hyping and salivating over all programming jobs disappearing in N months. He doesn't care at all if it's true or not. In fact he's in a terrible position to even understand if this is possible or not (probably hasn't coded for 10+ years). He's just in the business of selling tokens. | | |
| ▲ | bombcar 6 hours ago | parent [-] | | And worse, he (eventually) has to sell tokens above cost - which may have so much "baggage" (read: debt to pay Nvidia) that it'll be nearly impossible; or a new company will come to play with the latest and greatest hardware and undercut them. It's just as if Boeing were able to release a supersonic plane that was also twice as efficient tomorrow: it'd destroy any airline that was deep in debt for its current "now worthless" planes. | |
| |
| ▲ | sumedh 2 hours ago | parent | prev | next [-] | | > At that point you are beholden to your shareholders No not really, you can issue two types of shares, the company founders can control a type of shares which has more voting power while other shareholders can get a different type of shares with less voting power. Facebook, Google has something similar. | | |
| ▲ | what an hour ago | parent [-] | | No, they still have to act in the interest of shareholders even if they have no voting power. |
| |
| ▲ | devmor 7 hours ago | parent | prev [-] | | Skeptical is a light way to put it. It is essentially a foregone conclusion that once a company IPOs, any veil that they might be working for the global good is entirely lifted. A publicly traded company is legally obligated to go against the global good. | |
| ▲ | mattkevan 7 hours ago | parent | next [-] | | It’s not really; companies like GM used to boast about how well they treated their employees and communities. It was Jack Welch and a legion of like-minded arseholes who decided they should be increasingly richer no matter who or what paid for it. | |
| ▲ | axpy906 3 hours ago | parent | next [-] | | It’s funny how corporations get a bad rap. Have you ever worked with private equity? Bad to worse. | |
| ▲ | dboreham 6 hours ago | parent | prev | next [-] | | See also HP. Pretty much only Costco left. | | |
| ▲ | chrisweekly 4 hours ago | parent | next [-] | | This is where PBCs (Public Benefit Corporations) and B-Corps may have a role to play. Something like that seems necessary to enable both (A) sufficient profitability to support innovation and viability in a capitalist society and (B) consideration of the public good. Traditional public companies aren't just disincentivized from caring about externalities, they're legally required to maximize shareholder profits, full stop. Which IMHO is a big part of the reason companies ~always become "evil". | |
| ▲ | devmor an hour ago | parent | prev [-] | | Costco is such a strange and stark case standing in opposition to this general rule. From everything I hear, I can only gather that the reason is because of extremely experienced and level-headed executive staff. |
| |
| ▲ | tehjoker 4 hours ago | parent | prev | next [-] | | The previous deal was due to (a) a lower level of development of capitalism (b) a higher profit margin that collapsed in the 70s (c) a communist movement that threatened capital into behaving | | | |
| ▲ | renticulous 5 hours ago | parent | prev [-] | | The productive middle class produces common goods and resources, which then get exploited by elites. It's the Tragedy of the Commons applied to the wealth-generation process itself. |
| |
| ▲ | giwook 7 hours ago | parent | prev | next [-] | | Fair point. Call me an optimist, but I'm still holding out hope that Amodei can and will still do the right thing. That hope is fading fast though. | |
| ▲ | WarmWash 7 hours ago | parent | prev [-] | | The problem is that people equate money to power and power to evil. So no matter what, if you do something lots of people like (and hence compensate you for), you will be evil. It's a very interesting quirk of human intuition. | | |
| ▲ | arcanemachiner 7 hours ago | parent | next [-] | | A reasonable conclusion, considering that money and power seem to have their own gravity, so people with more of both end up getting even more of both, and vice versa. Can't blame someone who comes to such a conclusion about money and power. | | |
| ▲ | WarmWash 5 hours ago | parent [-] | | The unreasonable part is automatically labeling power as evil. | |
| ▲ | epsilonic 5 hours ago | parent | next [-] | | It’s a sane default to label power as evil in a society driven by greed, usury, and capital gain. Power tends to corrupt, particularly when the incentives driving its pursuit or sustenance undermine scruples or conscientiousness. It is difficult to see how power is not corrupting when it becomes an end in itself, rather than a means directed toward a worthy or noble purpose. | |
| ▲ | ModernMech 5 hours ago | parent | prev [-] | | Labeling power evil is not automatic, its just making an observation of the common case. Money-backed power almost never works for the forces of good, and the people who claim they're gonna be good almost always end up being evil when they're rich and powerful enough. See also: Google. | | |
| ▲ | WarmWash 4 hours ago | parent [-] | | Google is the company that created a class-less non-hierarchical internet. Everyone can get the same access to the same services regardless of wealth or personhood. Google is probably the most progressive company to ever exist, because money stops no one from being able to leverage google's products. Born in the bush of the Congo or a high rise in Manhattan, you are granted the same google account with the same services. The cost of entry is just to be a human, one of the most sacrosanct pillars of progressive ideology. Yet here they are, often considered one of the most evil companies on Earth. That's the interesting quirk. | |
| ▲ | ModernMech 4 hours ago | parent | next [-] | | Lot of people and companies were responsible for that. Anyway, that says nothing about what Google has become. | |
| ▲ | devmor 4 hours ago | parent | prev [-] | | > Google is the company that created a class-less non-hierarchical internet. Can you explain what you mean by this? I disagree but I don't understand how you think Google did this so I am very curious. For my part, I started using the internet before Google, and I strongly hold the opinion that Google's greatest contribution to the internet was utterly destroying its peer to peer, free, open exchange model by being the largest proponent of centralizing and corporatizing the web. |
|
|
|
| |
| ▲ | tehjoker 4 hours ago | parent | prev [-] | | Money and power are good when used democratically to clearly benefit the majority of the people. They are bad otherwise. It is hard to see this because we live in such a regime that exists in the negative space seemingly without beginning or end. Other countries have different relationships to their population. |
|
|
| |
| ▲ | ljm 6 hours ago | parent | prev | next [-] | | They're also getting into cloud compute given you can use the desktop app to work in a temporary sandbox that they provision for you. I was about to call it reselling but so many startups with their fingers in the tech startup pie offer containerised cloud compute akin to a loss leader. Harking back to the old days of buying clock time on a mainframe except you're getting it for free for a while. | |
| ▲ | zozbot234 6 hours ago | parent | prev [-] | | The "real cost" of running near-SOTA models is not a secret: you can run local models on your own infrastructure. When you do, you quickly find out that typical agentic coding incurs outsized costs by literal orders of magnitude compared to the simple Q&A chat most people use AI for. All tokens are very much not created equal, and the typical coding token (large model, large noisy context) costs a lot even under best-case caching scenarios. |
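A back-of-envelope comparison of why agentic coding tokens cost so much more than chat tokens (every number here is a placeholder for illustration, not a real rate card or a measured context size):

```python
def cost_usd(input_tokens: int, output_tokens: int,
             in_price: float = 3.0, out_price: float = 15.0) -> float:
    """Cost of one request at hypothetical per-million-token prices."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

# A chat Q&A turn: small prompt, modest answer.
chat_turn = cost_usd(2_000, 500)

# One agentic coding step: the whole file tree, diffs, and tool output
# get re-sent as context on every iteration of the loop.
agent_step = cost_usd(150_000, 2_000)

print(f"agentic step costs ~{agent_step / chat_turn:.0f}x a chat turn")
```

Multiply that per-step ratio by the dozens of loop iterations a single coding task can take, and the "orders of magnitude" gap the parent describes falls out, even before accounting for cache misses.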
|
|
| ▲ | iainmerrick 3 hours ago | parent | prev | next [-] |
| That sounds very plausible. But it implies they could offer even higher performance models at much higher costs if they chose to; and presumably they would if there were customers willing to pay. Is that the case? Surely there are a decent number of customers who’d be willing to pay more, much more, to get the very best LLMs possible. Like, Apple computers are already quite pricey -- $1000 or $2000 or so for a decent one. But you can spec up one that’s a bit better (not really that much better) and they’ll charge you $10K, $20K, $30K. Some customers want that and many are willing to pay for it. Is there an equivalent ultra-high-end LLM you can have if you’re willing to pay? Or does it not exist because it would cost too much to train? |
| |
| ▲ | criemen 3 hours ago | parent [-] | | > Is there an equivalent ultra-high-end LLM you can have if you’re willing to pay? Or does it not exist because it would cost too much to train? I guess at the time that was GPT-4.5. I don't think people used it a lot because it was crazy expensive, and not that much better than the rest of the crop. | | |
| ▲ | foobar10000 40 minutes ago | parent [-] | | The issue is not better - it’s better _AND_ fast enough. An agentic loop is essentially [think,verify] in a loop - i.e. [t1,v1,t2,v2,t3,v3,…] A model that does [t1,t2,t3,t4] in 40 minutes, if verify takes 10 min, will most likely do MUCH worse than a model that does t1 (decently worse) in 10 mins, v1 in 10 mins, t2 now based on t1 and v1 in 10 mins, v2 in 10 mins, etc.. So, for agentic workflows - ones where the model gets feedback from tools, etc…, fast enough is important. |
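One way to make the arithmetic behind this concrete (reusing the comment's own hypothetical 10- and 40-minute figures; what matters is feedback rounds per budget, not raw speed):

```python
def feedback_rounds(budget_minutes: int, think_minutes: int, verify_minutes: int) -> int:
    """How many complete think->verify iterations fit in a wall-clock budget."""
    return budget_minutes // (think_minutes + verify_minutes)

# In the same 80-minute budget, a model that thinks for 40 minutes per
# step completes only 1 round of tool feedback, while a model that
# thinks for 10 minutes per step completes 4 rounds, each one informed
# by the previous verify.
print(feedback_rounds(80, 40, 10))  # 1
print(feedback_rounds(80, 10, 10))  # 4
```

So a moderately weaker model that closes the loop four times can beat a stronger model that only closes it once, which is the comment's point about "fast enough".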
|
|
|
| ▲ | conductr 2 hours ago | parent | prev | next [-] |
Yeah. Combine this with the fact that many corporations right now have a “burn as many tokens as you need” policy on AI, and the incentive is there for them to raise prices and find an equilibrium point, or at least reduce the bleed. |
|
| ▲ | amelius 2 hours ago | parent | prev | next [-] |
| Once they implement their models directly in silicon, the cost will come down and the speed will go up. See Taalas. |
| |
| ▲ | aaronblohowiak an hour ago | parent [-] | | Taalas is amazing. I'd gladly spend $5-15k on something that matched that performance with Opus 4.6 quality. | |
|
|
| ▲ | ethin 6 hours ago | parent | prev | next [-] |
| I mean, the signs have been there that the costs to run and operate these models weren't as simple as inference costs. And the signs were there (and, arguably, are still there) that it costs Anthropic way, way more than many people like to claim. So to me this price hike is not at all surprising. It was going to come eventually, and I suspect it's nowhere near over. It wouldn't surprise me if in 2-3 years the "max" plan is $800 or even $2000. |
| |
| ▲ | ezst 5 hours ago | parent [-] | | > It wouldn't surprise me if in 2-3 years the "max" plan is $800 or $2000 even. I'd rather be surprised if they are still doing business by then. | | |
| ▲ | QuiEgo 5 hours ago | parent [-] | | I would not be surprised at all: a $1,000/mo tool that makes your $20,000/mo engineer a lot more productive is an easy sell. I’m guessing we’re gonna have a world like working on cars - most people won’t have expensive tools (e.g. a full hydraulic lift) for personal stuff; they are gonna have to make do with lesser tools. | |
| ▲ | selfmodruntime 2 hours ago | parent | next [-] | | No engineer will cost $20,000 a month at this point in time. Offshoring is still happening aggressively. | |
| ▲ | cyanydeez 4 hours ago | parent | prev [-] | | No way. I bought a $3k AMD 395+ during the Sam Altman price hike and it's got a local model that readily accomplishes menial tasks. There's a ceiling to these price hikes because open weights will keep popping up as competitors try to advertise their wares. Sure, the capabilities differ, but there's definitely not that much cash in proprietary models given their indeterminacy. | | |
|
|
|
|
| ▲ | Lihh27 4 hours ago | parent | prev | next [-] |
| heh adaptive thinking is letting the meter run itself. they make more when it runs longer. |
|
| ▲ | orangecar 27 minutes ago | parent | prev | next [-] |
| [dead] |
|
| ▲ | paulddraper 7 hours ago | parent | prev [-] |
| > The fact that Anthropic is rapidly trying to increase price may betray the fact that their recent lead is at the cost of dramatically higher operating costs. Or they are just not willing to burn obscene levels of capital like OpenAI. |
| |