everforward 9 hours ago

There’s no way to justify their valuations if they get downgraded to a pair programming tool. They need fully agentic stuff to work and replace human engineers to even come close.

Offhand, I’m not even certain whether a model like that could justify the constant retraining we’re doing on the agentic models.

It doesn’t make a lot of sense to spend millions or billions on training to reduce hallucinations by 0.3% if your model assumes a human is in the loop to course-correct them.

keeda 8 hours ago | parent | next [-]

Some napkin math -- total global labor compensation is about 50% of the GDP, which puts it in the USD 50 - 60 Trillion range: https://ourworldindata.org/grapher/labor-share-of-gdp

This source claims that knowledge workers alone (probably because they are paid much more) account for 35 - 50 Trillion of that: https://github.com/danielmiessler/Substrate/blob/main/Data/K...

If LLMs can boost their productivity even by an average of 5% (studies from ~2024 put it in the ~30% range depending on task) that is ~1.5 - 2.5T in value annually. Even if the AI industry can capture a fraction of that, that is a huuuge monetization opportunity.
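A minimal sketch of that napkin math, using only the figures quoted above (35 - 50 Trillion in knowledge-worker compensation, a 5% or ~30% boost). The function names are made up for illustration. Pricing a productivity boost at the cost of the labor it augments is the comment's own simplification, not an established valuation method.

```python
# Napkin math from the comment above; every input is a figure quoted in the thread.
KNOWLEDGE_WORKER_COMP_TRILLIONS = (35.0, 50.0)  # USD per year, low and high estimate


def annual_value(compensation_trillions: float, productivity_boost: float) -> float:
    """Value of a productivity boost, priced at what the labor already costs."""
    return compensation_trillions * productivity_boost


def value_range(boost: float) -> tuple[float, float]:
    """Low and high annual value (in trillions of USD) for a given boost."""
    low, high = KNOWLEDGE_WORKER_COMP_TRILLIONS
    return annual_value(low, boost), annual_value(high, boost)


if __name__ == "__main__":
    for boost in (0.05, 0.30):
        low, high = value_range(boost)
        print(f"{boost:.0%} boost: {low:.2f}T to {high:.2f}T USD per year")
```

With these inputs the 5% case comes out at 1.75T to 2.5T, so the low end is slightly above the rounded ~1.5T quoted.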

Note, at 5% productivity boost, humans are not just in the loop, they are the loop. AGI or large-scale replacement of humans is not even needed, but the financial opportunity is already immense, and it scales with how much human productivity can be improved (i.e. how much work can be offloaded to LLMs.)

Now, I don't think AGI will happen soon (or has already happened, depending on how you define it) but I do think humans will be a much smaller part of the loop and large-scale job displacement will happen once companies figure out how to properly use AI.

At this point, the financial upside for the AI industry is extremely high but will be limited by the social turmoil that will inevitably ensue (which we're already seeing brewing in the data center backlash.)

e9 8 hours ago | parent | next [-]

I want to propose an alternative reality where 1.5-2.5T in value doesn't go to a handful of companies. Instead it turns out to be like restaurants, where this gets distributed to lots and lots of small, local, mostly interchangeable teams. There will of course be some superstar "chefs" leading the industry and setting trends, and some "restaurant chain"-like big businesses and a supply chain for all of this.

keeda 5 hours ago | parent | next [-]

FWIW I do think that availability of competitive open weight and other non-frontier models, along with improvements in harnesses that can get good results out of these models, will result in less concentration and a healthier marketplace.

However, these frontier labs are also making moves that could let them capture a disproportionate share of the upside. One possibility is a situation analogous to the smartphone manufacturing space, where there are dozens of players but just a handful (e.g. Apple, Samsung in smartphones) capture the lion's share of the revenue.

skeptic_ai 3 hours ago | parent [-]

With Apple, you can’t exit the ecosystem.

Samsung is the same, and it makes the best Android devices.

If a Nokia OS came out tomorrow, it would be dead in the water: it has no apps.

But with a new LLM that doesn’t matter. There is nothing sticky about typing Gemini, Claude or codex in a CLI.

keeda an hour ago | parent [-]

There's nothing sticky today but you can bet they're working maniacally to fix that. These companies will make most of their money in the enterprise space and there are probably unlimited ways to engineer stickiness in an enterprise setting. Like, MSFT still rakes in those billions despite pretty much every one of their products having commodity competitors.

The AI labs are also making moves to secure long-term enterprise presence, such as their Forward Deployed Engineer strategy. I think that is a trojan horse play that could make enterprises dependent on them forever, much like so many companies are still dependent on IBM's mainframes. As an extreme example, you could imagine a company's core business logic encoded in the weights of a proprietary model custom-trained and hosted by one of these model providers, something even more inscrutable and sticky than ancient COBOL codebases.

xxpor 7 hours ago | parent | prev | next [-]

The world is not zero sum. Value is created, not just preserved. Anthropic and OpenAI creating value does not imply that smaller guys can not also create value.

afavour 7 hours ago | parent [-]

But marketplaces also exist and big players in a marketplace are often able to manipulate the market such that they are advantaged and small players are not able to break in.

mpyne 6 hours ago | parent [-]

This is true of every market that has ever existed, and that's not stopped small players from finding niches.

bdamm 7 hours ago | parent | prev | next [-]

How? Training and operating models seems to naturally favor those willing to invest quite significantly in these operations.

nish__ 6 hours ago | parent [-]

If RAM prices come down, running your own models will be relatively affordable.

actionfromafar 7 hours ago | parent | prev [-]

Sysco is pretty big.

ricardobayes 7 hours ago | parent | prev | next [-]

I am deeply surprised by the silence of philosophers, sociologists, liberal arts majors, economists. Where are the think tanks who contemplate and debate the societal aspects? The tech is advancing full steam but the "other side" doesn't feel anywhere near ready.

bloppe 7 hours ago | parent | next [-]

Idk why you're perceiving silence. Feels to me like this is the main thing people talk about nowadays.

scarmig 7 hours ago | parent [-]

It has to do with the scope of what they're discussing. It seems extraordinarily small: e.g. what if AI increases productivity growth by 0.4%? Do data centers use too much water? Are AIs racist when reviewing resumes?

The frontier labs, on the other hand, are thinking about replacing all human labor, ending death, and the risk of it causing human extinction. Most of the apparatus we're talking about approaches it very parochially; it's almost like they're embarrassed to take the grander ideas even a little seriously, for being too nerdy/sci-fi.

freejazz 6 hours ago | parent [-]

The public would happily string up any of these CEOs if given the chance

bdamm 7 hours ago | parent | prev | next [-]

Because the "other side" is busy trying to anthropomorphise AI into solving the trolley problem, while being mostly clueless about the actual problems.

They'll show up after the fact and whinge endlessly about how they should have been involved.

DrewADesign 4 hours ago | parent [-]

I guess the real problems are things like people not being allowed to post AI-generated images in digital drawing, painting, and photography communities, because I see a lot of boosters ceaselessly whining about that abject “discrimination”, despite having plenty of places where people post all kinds of that garbage all the time.

Or maybe every cultural group has its own set of whiners and we always think the ones we disagree with are the loudest.

digitaltrees 7 hours ago | parent | prev | next [-]

Reid Blackmun has written several books and has a consulting agency to guide ethical implementation of AI.

freejazz 6 hours ago | parent | prev | next [-]

Silence? Even the pope has come out against AI? Who hasn't? Diplo??

DrewADesign 4 hours ago | parent | prev [-]

Sometimes the great algorithmic gods give us a glimpse of our own bubble.

everforward 6 hours ago | parent | prev | next [-]

> Note, at 5% productivity boost, humans are not just in the loop, they are the loop. AGI or large-scale replacement of humans is not even needed, but the financial opportunity is already immense, and it scales with how much human productivity can be improved (i.e. how much work can be offloaded to LLMs.)

The studies I've seen recently (at least in the software space) put it at something like a 10% increase in coding speed, which for me would probably translate to something like a 3% increase in productivity. I spend a lot more time on things like getting agreement between teams, documenting approaches to things that don't exist on the wiki, etc, that LLMs are significantly less effective at. Or just can't do; no one will be happy if I send an LLM instead of me to meetings.

I suspect a lot of roles are like that. They give a 10-30% boost to the core role function, but that core role is still only 30-50% of what you do.
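The "10% faster coding becomes roughly 3% more productive" estimate is essentially Amdahl's law applied to a job. A minimal sketch, assuming the rest of the role is not sped up at all; the function name and the specific shares are illustrative, not taken from any study:

```python
def overall_gain(core_share: float, core_speedup: float) -> float:
    """Fractional increase in total output when only part of a job gets faster.

    core_share:   fraction of working time spent on the accelerated task (0..1)
    core_speedup: fractional speedup of that task, e.g. 0.10 for "10% faster"
    """
    if not 0.0 <= core_share <= 1.0:
        raise ValueError("core_share must be between 0 and 1")
    if core_speedup <= -1.0:
        raise ValueError("core_speedup must be greater than -1")
    # Time for the same amount of work, relative to 1.0 before the speedup.
    new_time = (1.0 - core_share) + core_share / (1.0 + core_speedup)
    return 1.0 / new_time - 1.0


if __name__ == "__main__":
    for share in (0.3, 0.5):
        for speedup in (0.10, 0.30):
            gain = overall_gain(share, speedup)
            print(f"core is {share:.0%} of the job, {speedup:.0%} faster: {gain:.1%} overall")
```

The gain is capped by how much of the role the tool touches: even an infinite speedup of a task that is 30% of the job frees up only that 30%.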

> that is ~1.5 - 2.5T in value annually

That seems really large, but it's ~2-3x Walmart's yearly revenue, and OpenAI and Anthropic both have estimated valuations that compare to Walmart's market cap. And this is before we consider that they need to do it for cheaper or why would anyone bother. Realistically, potential revenue is probably half that at best.

It's also before cutthroat pricing really kicks in. People are willing to pay for Claude right now; I still suspect that as time goes on people will start looking towards Deepseek/GLM/etc models that provide 95% of the performance at 10% of the price. That'll cut the market even further.

The question is how much demand for knowledge work swells as prices fall, and whether that's a soft landing or a crash.

keeda 3 hours ago | parent [-]

> That seems really large, but it's ~2-3x Walmart's yearly revenue, and OpenAI and Anthropic both have estimated valuations that compare to Walmart's market cap. ...

It's also before cutthroat pricing really kicks in.

Right, that's more of an estimate on the value proposition of the overall AI industry, rather than valuations of the industry or specific players. While I don't think OpenAI and Anthropic will capture all of the potential upside, I do suspect they will do much better than other players despite the competition (https://news.ycombinator.com/item?id=48740472)

> And this is before we consider that they need to do it for cheaper or why would anyone bother.

Typically yes, but there are reasons companies may be willing to pay the same amount or even more, such as "AI doesn't need sleep, holidays, insurance, or benefits" and "AI is easier to procure and replace than humans."

> The studies I've seen recently (at least in the software space) put it at something like a 10% increase in coding speed...

Curious to see which studies you're looking at, the studies I'm thinking of (some here: https://news.ycombinator.com/item?id=45379452) are from 2024 - 2025, so already old and before agents really took off.

However, your point about meetings and agreements and documenting is much more germane. My theory is that the largest productivity gains -- and subsequent labor displacement -- will come from reducing coordination overhead: https://news.ycombinator.com/item?id=48040999

danenania 7 hours ago | parent | prev | next [-]

I’d also point out that LLM inference revenue already totals more than 100B annually based on publicly reported numbers. Almost none of that is replacing knowledge workers. Almost all is increasing their productivity. So empirically what you describe is already happening to a nontrivial degree.

hedora 7 hours ago | parent | prev | next [-]

You’re trying to apply value based pricing (infinite margin upside) to a commodity.

Pre-bubble pricing: $1400 gets a 128GiB iGPU optimized for inference. GLM and Kimi need 800-1000GiB. Call it 1TiB. The $1400 boxes could be ganged into sets of 4-8, with a switch. Call the switch $1000.

Each box has a TDP of 250W. 8 x 250/120V = 16.666A, or one household circuit in the US, so no new power infrastructure is needed.

$1400 x 8+1000=$12,200. Assuming standard five year depreciation, that’s $2440 a year. There are a billion knowledge workers alive today. So that’s $2.4T annual revenue. Average net profit margins on computer hardware are 4.3%. That works out to $105B net income, globally.
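The hardware arithmetic above, written out as a sketch. All constants are the figures quoted in this comment. The one added assumption is that every knowledge worker gets a dedicated 8-box cluster, which is what the $2.4T total implies.

```python
# Figures quoted in the comment above.
BOX_PRICE_USD = 1400
BOX_MEMORY_GIB = 128
BOX_TDP_WATTS = 250
SWITCH_PRICE_USD = 1000
BOXES = 8
MAINS_VOLTS = 120
DEPRECIATION_YEARS = 5
KNOWLEDGE_WORKERS = 1_000_000_000
HARDWARE_NET_MARGIN = 0.043


def cluster_summary() -> dict:
    """Cost, memory and power draw of one cluster, scaled to one cluster per worker."""
    capex = BOX_PRICE_USD * BOXES + SWITCH_PRICE_USD
    per_year = capex / DEPRECIATION_YEARS  # straight-line depreciation
    revenue = per_year * KNOWLEDGE_WORKERS
    return {
        "memory_gib": BOX_MEMORY_GIB * BOXES,
        "amps": BOX_TDP_WATTS * BOXES / MAINS_VOLTS,
        "capex_usd": capex,
        "usd_per_worker_year": per_year,
        "global_revenue_usd": revenue,
        "global_net_income_usd": revenue * HARDWARE_NET_MARGIN,
    }


if __name__ == "__main__":
    for key, value in cluster_summary().items():
        print(f"{key}: {value:,.2f}")
```

Changing `BOXES` to 4 or the depreciation period to 3 years shows how sensitive the per-worker cost is to those two guesses.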

So, I guess the question is whether the (currently #2) open weight models provide $1.4-2.4T less value per year than the #1 and #3 models, and, if so, if customers can measure this, or are willing to spend 2x more and deal with censorship, data theft, intentional enshittification, sabotage, ads, product placement, etc, to get the slightly “better” model.

Also, note that my numbers assume Moore’s law stopped for all time in 2024, but we’ve seen HW improvements since then.

keeda 2 hours ago | parent [-]

Right, that number is more of an estimate of the value proposition of the entire AI industry rather than projections of revenue or valuations... it's essentially an estimate on how much the market could theoretically bear. Whether the companies can capture that value is, to your point, rightly a different question.

I do think open weight and other competitor models, especially with better harnesses, will play a significant role in the equation and will result in less concentration in the market. However, I do also think the big AI companies will capture a lot of that value. Partially for the same reasons that the cloud industry has been growing like gangbusters, even pre-AI, despite on-prem being much cheaper: companies will outsource anything that is not deemed a "core competency" for their business.

A lot of the problems you mentioned will be relegated to the consumer market and won't apply to enterprise contracts -- which is where the real money is.

parineum 7 hours ago | parent | prev | next [-]

> If LLMs can boost their productivity even by an average of 5% (studies from ~2024 put it in the ~30% range depending on task) that is ~1.5 - 2.5T in value

Minus the cost of inference, that might not be the boon you're making it out to be. I hear what people around here are spending on their api and I'm skeptical that these tools are making me that much more productive.

Personally, for assisted development, I haven't seen much progress in a while.

4rf 3 hours ago | parent | prev [-]

What a load of nonsense lmao.

Pls stop posting you are creating noise.

overgard 9 hours ago | parent | prev | next [-]

That's a really good point. I think if there wasn't the insane amount of money involved and these were treated as tools instead, they would probably be MORE productive. I think a person working hand in hand with an AI instead of delegating is the sweet spot of making things fast while also not losing understanding or control of the system. You are absolutely right that these companies can't justify their valuations if they do that though. I just got a new mac to run models locally, and so far the results have been positive with some small hiccups. I'm thinking the future of this tech will likely be better tooling with better IDE integrations rather than "Claude plz make me a SaaS kthx"

everforward 6 hours ago | parent | next [-]

> I'm thinking the future of this tech will likely be better tooling with better IDE integrations rather than "Claude plz make me a SaaS kthx"

I think this sort of thinking is a trap, because it presumes that all software has the same constraints.

There's a spectrum of requirements between "chuck this over the wall at Claude, it only has to work once" and "this is a literal rocket ship, formally verify the whole thing".

I've made some things with Claude I don't understand and don't control. It's fine, they're still useful to me. Things for the house that I wasn't going to build manually, some dashboarding stuff and scripts for work, stuff that can crash and burn and I'll be fine.

They won't justify trillions in investment, but they are useful.

Equally, I do agree with you on some things. Sometimes I hand-hold the LLM or forgo it entirely because I want to be 100% sure I know how something works, and can justify a decision if it causes a production outage.

I think the future is probably multiple different tools with different goals. Better IDE integration for some uses, an entirely separate "LLM herd controller" kind of thing for when you're okay with vibe-coding, and the most interesting is something in the middle where you're more in the loop than pure vibe-coding, but don't see the full context like in an IDE. Something where it surfaces changes to key components, but hides things like test changes.

balder1991 4 hours ago | parent [-]

It’s what software engineering calls “casual software”, as distinct from “business software” and “critical software”. Not all types need a high bar of quality, and most software engineering practices are tailored for business applications that will be made available to multiple users.

As you said, building a script that only you use, or a very simple thing that accomplishes one task and is easy to test, requires almost no engineering, and an LLM can often build those with very few downsides.

ah1508 6 hours ago | parent | prev | next [-]

> while also not losing understanding

That's a key point. Keeping knowledge and know-how inside the company is strategic. For most people, GPS did not result in a better sense of direction, spellchecking did not help them write without making mistakes, and delegating translation to DeepL does not help them get better at a foreign language. I don't see the gain for an individual, a company, or a society if a technology reduces the ability to think, do stuff, understand complex problems, and work hard at something. Hiring juniors also matters: what is boring for a senior dev is useful for a junior, like the "wax on, wax off" in The Karate Kid. Then when the senior dev retires, the junior is not a junior anymore and the know-how is still here. I want to transfer my knowledge to a junior, not to Anthropic or Google or OpenAI.

Ideally, working hand in hand with an AI could be like riding a motorcycle vs riding a bicycle. Both are fine, but you go much faster with a motorcycle and you don't lose any ability. But prompting a motorcycle autopilot by voice sounds a bit stupid and boring. The insane use of energy rarely comes into the equation, which is a bit weird. Personally, it is why I am never tempted to use AI. However, I see value in AI for finding weaknesses in code (the inverse of flattery), writing tests with all the edge cases based on specs, since tests are often sloppy, and asking for a fresh view on a very difficult problem. I'd love to hear about the equivalent of move 37 in game 2 (AlphaGo vs Lee Sedol) in a difficult programming task. But I think that massive delegation of code writing is how you lose the knowledge and the know-how: what keeps us sharp.

Final word: I once asked Claude for a review; the code involved a DB transaction. Nothing complicated, and Claude said everything was fine. However, the transaction isolation level was not set (I did it on purpose, as if I did not know about isolation levels). It did not ask me if it was my intention to keep the default level. I would have preferred challenging feedback: Why did you choose the default isolation level? Is it on purpose? Do you know that the default depends on the DB? Do you know about isolation? Tell me about the business use case and I'll explain which one would be the best.
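For readers who have not run into this: a rough sketch of what "setting the isolation level on purpose" can look like. It assumes a Python DB-API 2.0 connection with autocommit off and a database that accepts the SQL-standard `SET TRANSACTION ISOLATION LEVEL` statement (PostgreSQL and MySQL do; SQLite does not). The `transaction` helper is hypothetical, not part of any driver.

```python
from contextlib import contextmanager

ISOLATION_LEVELS = {
    "READ UNCOMMITTED",
    "READ COMMITTED",
    "REPEATABLE READ",
    "SERIALIZABLE",
}

# Documented defaults differ by engine, which is the point of the review question.
DOCUMENTED_DEFAULTS = {
    "postgresql": "READ COMMITTED",
    "mysql-innodb": "REPEATABLE READ",
}


@contextmanager
def transaction(conn, isolation_level: str):
    """Run a block in one transaction with an explicitly chosen isolation level."""
    if isolation_level not in ISOLATION_LEVELS:
        raise ValueError(f"unknown isolation level: {isolation_level!r}")
    cur = conn.cursor()
    try:
        # Must be issued before any other statement of the transaction.
        cur.execute(f"SET TRANSACTION ISOLATION LEVEL {isolation_level}")
        yield cur
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        cur.close()
```

Usage would be along the lines of `with transaction(conn, "SERIALIZABLE") as cur: cur.execute(...)`, which at least forces whoever writes the code to pick a level rather than inherit one.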

user43928 7 hours ago | parent | prev [-]

I am thinking the opposite. I've been having great results with handing more and more responsibilities to the agent.

Contrary to what some people suggest, I have not hit any maintenance or reliability dead ends. If something breaks, the agent fixes it.

If it cannot, I have the agent instrument the code and work through the logs to check hypotheses, until the source of the issue is found.

If even that failed, which has not happened yet, I could still do some old-fashioned digging and learning, like I always have.

This is for native mobile app development, and the code base is around 100k LOC.

tskj 8 hours ago | parent | prev | next [-]

Dario has publicly claimed each model has been profitable, even accounting for its training costs; it's just that each new model is exponentially more expensive to train than the last, so the income lags and it looks like the company is losing money overall.

Now, we can't know if this is true unfortunately, but it's not directly contradicted by anything that's known publicly at least. I thought it was an interesting way to frame it and makes the whole situation look marginally less bad.
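A toy model of that framing, to show that "every model is profitable" and "the company loses money every year" are not contradictory. All numbers are invented. It assumes each model is trained in one year, earns its entire (serving-cost-adjusted) revenue the following year, and that each generation costs a fixed multiple of the last.

```python
def cohort_view(first_training_cost: float, cost_growth: float,
                revenue_multiple: float, generations: int) -> list[dict]:
    """Per-model profit vs. company-level yearly cash flow.

    cost_growth:      training cost of model n+1 divided by that of model n
    revenue_multiple: lifetime revenue of a model divided by its training cost
    """
    rows = []
    cost = first_training_cost
    revenue_from_previous_model = 0.0
    for year in range(generations):
        lifetime_revenue = cost * revenue_multiple
        rows.append({
            "year": year,
            "training_cost": cost,
            # Profit attributed to the model trained this year.
            "model_profit": lifetime_revenue - cost,
            # Cash this year: last model's revenue minus this model's training bill.
            "company_cash_flow": revenue_from_previous_model - cost,
        })
        revenue_from_previous_model = lifetime_revenue
        cost *= cost_growth
    return rows


if __name__ == "__main__":
    for row in cohort_view(1.0, 3.0, 2.0, 4):
        print(row)
```

In this setup the company is cash-negative whenever `cost_growth` exceeds `revenue_multiple`, and it flips positive as soon as training spend stops growing faster than that.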

NorwegianDude 6 hours ago | parent | next [-]

A common and extreme misconception is that inference is expensive and that providers are losing a lot of money. Inference is extremely lucrative and profitable.

drob518 4 hours ago | parent [-]

Inference is the phase where they make money. But the question is whether they can be profitable overall as training continues to balloon.

4rf 3 hours ago | parent | prev [-]

why are you listening to these idiots who have every incentive to spin the story as much as possible

FCFF = EBIT(1-t)-Reinvestment

I don't care about your gross profit - this kind of cash profit determines the value of operating assets.
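For anyone unfamiliar with the formula, a small worked sketch. The numbers are invented and do not describe any real company. Reinvestment is taken here in its usual textbook form: net capital expenditure plus the change in working capital.

```python
def reinvestment(capex: float, depreciation: float,
                 change_in_working_capital: float) -> float:
    """Net capital expenditure plus the change in working capital."""
    return capex - depreciation + change_in_working_capital


def fcff(ebit: float, tax_rate: float, reinvested: float) -> float:
    """Free cash flow to the firm: EBIT(1 - t) - Reinvestment."""
    return ebit * (1.0 - tax_rate) - reinvested


if __name__ == "__main__":
    # Hypothetical figures in billions of USD.
    reinvested = reinvestment(capex=20.0, depreciation=5.0,
                              change_in_working_capital=1.0)
    print(fcff(ebit=10.0, tax_rate=0.25, reinvested=reinvested))
```

A firm can report positive operating income and still have negative FCFF when reinvestment, such as training runs and data centers, outruns after-tax earnings.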

sanderjd 9 hours ago | parent | prev | next [-]

My two cents is that the way to square this circle is that the valuations should be lower and they should be spending a lot less on constant retraining.

Unfortunately (from my perspective) it seems like the US companies are increasingly stuck in their current model. I think it's a competitive disadvantage.

But obviously most of the real insiders seem to disagree with me, so I'm probably wrong :)

wyre 8 hours ago | parent [-]

The insiders disagree because they are benefiting greatly from the insane valuations, right?

Chinese models are quickly commodifying frontier inference, and the US Gov is keeping domestic SOTA models from the public. Without those models, why would consumers still spend $200/month to use the best models?

It’s such a mess and isn’t inspiring confidence as a non-investor.

sanderjd 8 hours ago | parent [-]

Are they benefiting from the insane valuations though? If the valuations deflate before the insiders are able to exit, I think that would be worse for them than a lower but sustainable valuation.

It all comes down to whose prediction of the future is closer to correct. I think the most likely future is commodification of inference and "agent-assisted" rather than "agent-driven" workflows dominating the future of work. But insiders - who both know way more than me, and also have more skin in the game, both for better and worse - seem to really think I'm wrong about that.

So I dunno! Could go either way!

drob518 3 hours ago | parent | next [-]

It’s all about timing. This is tech bubble 2.0, Dotcom Boogaloo. If you’re able to flip it quickly, you’ll have generational wealth. If not, you could be holding a lot of worthless paper.

sanderjd an hour ago | parent [-]

Yes.

But is your impression that this is the strategy of people like Amodei? My impression is that it isn't, that they are actually true believers, and not just trying to hit the timing right and flip it.

wyre 6 hours ago | parent | prev [-]

Even if the future is agent-driven workflows, that doesn't stop the commodification of inference. A good agent-driven workflow, in my experience, is a byproduct of the harness and scaffolding around the agent.

What insiders are you talking about? They're going to be hot on the possibilities so they can exit to a massive windfall. I don't know why they would want to be publicly critical of these technologies that could make them millions on IPO.

sanderjd 6 hours ago | parent [-]

I'm talking about people who work at the frontier labs who talk to the press, and what seems to be the revealed beliefs of those same people from the strategies we see their companies pursuing.

My point is that actually it would be worse for these people if the valuations are only high during this period - which will last a while longer from now! - where their equity is not liquid, but crash as the market figures out this commoditization thing.

But if we're wrong about how that's going to go, then this isn't a concern because there won't be any devaluation. And to me that seems to be what they honestly think is going to happen. And they know more than me (and I think they're a lot smarter than me), so this does temper my confidence in my own predictions.

ricardobayes 8 hours ago | parent | prev | next [-]

At some point it's going to plateau, maybe already has. Then they will switch to FPGA/ASIC-based model-specific hardware for lower consumption. I'm pretty sure the "space data centers" won't use GPUs, they are not radiation-tolerant whereas FPGAs can be.

https://www.cerebras.ai/blog/gemma-4-on-cerebras-the-fastest...

quaverquaver 7 hours ago | parent [-]

I would not take "space data centers" as a given! From most to least likely, these will be vaporware, vaporized-ware, rubble-ware, loss leaders.

JumpCrisscross 9 hours ago | parent | prev | next [-]

> no way to justify their valuations if they get downgraded to a pair programming tool

I think there is. Pair today doesn’t mean they’re locked into that forever.

4rf 3 hours ago | parent | next [-]

you always post about valuations but never share your own.

go ahead m8 we are all waiting... the stage is yours. lets see your model.

ChrisLTD 6 hours ago | parent | prev [-]

Their valuations don't make sense as just programming tools, period. Forget about if they are still human driven.

EddieRingle 7 hours ago | parent | prev [-]

> There’s no way to justify their valuations if they get downgraded to a pair programming tool.

Honestly I still don't see how they justify their valuations, period. If anything they're serious liabilities.

Open-weight models are improving and reaching "good enough" levels for more and more tasks. They're also known quantities; you know what you're getting with them and don't have to worry about the model silently (or not so silently) being switched out from under you (whether that's because Anthropic/OpenAI decides you're not worthy of their latest and greatest for one reason or another, or they switch you to a quantized model to save on compute, or they simply sunset the specific model you've been relying on).

And if the open-weight model doesn't run on your local hardware already, there are any number of hosting providers that will handle that for you (so you're back to just paying for colocation/cloud usage instead of nebulous tokens).

Closed models are improving as well, sure, but diminishing returns will eventually kick in (as they already have for various tasks, as I said).

So if not their models, where does their value come from? Just simple network effects/lock-in? "Normal" users will drift to other options if they start showing more and more ads, and enterprise customers will surely be looking for opportunities to avoid lock-in and reduce risk.

I think the last argument I've heard is that these valuations are basically a bet that Anthropic and/or OpenAI will achieve AGI that can fully replace human labor, so they'll essentially be able to sell that replacement labor to everyone. They haven't managed to pull that off, yet, however. Businesses that have tried to replace humans almost immediately realized either that the AI's capabilities were oversold or that they at least needed a human in the loop still, to some degree. And even if they do achieve AGI, that would surely become an issue of national security (they're already flirting with that today), so who's to say governments won't simply nationalize the best AI labs and either remove them from the economy entirely or perhaps even provide models as a public service to level the playing field?

That all sounds like a giant gamble, if anything. And it's incredibly frustrating to watch as someone that's been unemployed for a year because (a) budgets are being burned on tokens and (b) LLM-generated applications are flooding hiring teams and preventing real people from being seen. (Not to mention, as someone that spends a lot of time in gaming circles, the fact that DRAM and flash storage are quickly becoming inaccessible is just an additional frustration that means people can't even find temporary relief in entertainment.) I can only hope this bubble finally implodes before I lose my house.

pixl97 6 hours ago | parent | prev [-]

>Open-weight models are ...

<banned>

Not the first one to come up with that likely outcome either. I mean, if you're being restricted from SOTA models now, how long do you expect before the FBI kicks in your door for using an 'illegal' open model?