dkobia 7 hours ago

Zitron is begging for a collapse at this point. Yes, his macro analysis correctly identifies a massive financial risk but his incessant pessimism completely misses the incredible ground-level utility that many of us on HN celebrate every day through undeniable, massive productivity gains.

At this point I'm trying to believe there's a middle ground where the level of individual capability this unlocks leads to major discoveries.

toasty228 5 hours ago | parent | next [-]

> undeniable, massive productivity gains.

Take any stock index, remove AI stocks, what do you see? That's right! Nothing...

So where is all the productivity going? Where is the value? Where are the massive unemployment stats or the millions of new startups making big $$$?

moritzwarhier 5 hours ago | parent | next [-]

Writing about AI, destroying the planet for data centers, there's a lot of money to be made.

That being said, AI seems kind of miraculous sometimes.

Similar to cars. So enticing that we make everything else in the world worse in order to maximize the profit, make it indispensable, subsidize it, and make the dependency on it irreversible.

And it's not even something to blame individual people for.

Driving away from all the other cars to spend a weekend feels like freedom.

Using AI to answer a question feels like a "bicycle for the mind".

But in fact it's more like a car. It requires massive resources and creates perverse incentives, and the result is ineffective and corrupt.

Both cars and AI are amazing technology and extremely useful, but using them is not an individual responsibility. It requires societal subsidy.

nfw2 5 hours ago | parent | next [-]

The environmental impact of answering a question on an obscure topic with an AI model is less than the impact of answering the question with an hour-long Google search hunting for references or a drive to the public library.

moritzwarhier 4 hours ago | parent | next [-]

That's true, and I am not anti-AI. I was not only thinking about the environmental effects of some single prompt or a certain amount of tokens.

Neither did I want to say that a car is always more wasteful than some alternative.

But defaulting to the behemoth is inefficient, unless everyone is driven to do it: then it's in some way reasonable.

By adding "corrupt" and "dependent", as well as the economic terms, I wanted to offer a broader critique and create an analogy, not just talk about energy usage on its own.

What I had in mind was: it's easier to go many places that are a mile or less from me, by car. Because everything is obstructed by cars. And I'm atrophied by lack of movement. Best would be to drive somewhere to move/walk.

People already do that in masses.

And doing shopping by car, because everything else seems unbearable, also takes away your time, apart from wasting energy compared to more, smaller shops that would be reachable by foot, bicycle etc.

I guess you know the argument.

Today, people's thinking atrophies because their LLM is probably right in their summarization of some Wikipedia article, plus 2-3 other random sources.

Or so.

Using the Wikipedia search function is not expensive.

But, I mostly had a bigger picture in mind than what is the cost of inference.

nfw2 4 hours ago | parent | next [-]

I think it's a good analogy in many ways, and personally I think car-centric society has a lot of flaws. I think the ease that AI brings to tasks may erode mental capabilities in the same way cars have eroded our collective physical health.* That said, it doesn't seem to me that we would be better off without cars altogether, despite all the related issues.

I am concerned about the environmental impacts that AI poses, but they don't seem to me to be so catastrophic. Solar and battery tech has made enormous leaps in the past couple decades, and we will need to pivot to a clean energy future irrespective of AI.

*This said, I have become gradually more alarmed over the past decade at the lack of epistemological rigor in the general public, as made apparent through the rise of social media. I don't know that AI becoming a truth-seeking crutch for people wouldn't be more good than bad.

moritzwarhier 3 hours ago | parent [-]

> it doesn't seem to me that we would be better off without cars altogether, despite all the related issues

Oh my god, no. I also want the benefits of automobiles! They are strictly more capable than, say, trains. That's where I would derail the discussion completely when going into details, but no, I am not against cars as a technology.

Apart from all the ethical and social arguments (logistics, ambulances, the elderly, etc etc). But that's not where I wanted to go.

I was making a leap here simply because of the whole complex around the prisoner's dilemma, the commons, state economy, and so forth.

Since at least ~100yrs ago, I guess cars and streets as the primary mode of transportation have also "won the vote" / are what the majority wants, so it's also an interesting analogy for diminishing returns maybe.

Building out more car infrastructure is certainly not controversial where there is absolutely none but there are commercial or residential buildings.

Anyway, lots of associations are worth considering here IMO. The ultimate limiting capacity here, when disregarding all environmental or health concerns, is simply space and the positive externalities (cities etc) around existing infrastructure.

RajT88 4 hours ago | parent | prev [-]

> I was not only thinking about the environmental effects of some single prompt or a certain amount of tokens.

Hand-wringing about AI datacenters' environmental impact is well and good. We should keep the data centers accountable for their consumption and waste.

I just wish the same people had been upset the last 20 years with poor water resource management in a lot of areas (the west US especially) with urban, ranching and farming development.

> That's true, and I am not anti-AI.

Me neither!

toasty228 4 hours ago | parent | prev | next [-]

It's like saying if we didn't have cheap commercial flights people would travel by foot anyways and would consume more resources for food &co. than the plane would consume in fuel...

80% of generative AI queries wouldn't even exist as google searches.

nfw2 4 hours ago | parent | next [-]

To be clear, your position here is that insurmountable barriers to information are the preferable state of the world?

One claim of the parent comment was that AI is ineffective. For the purpose of finding answers to questions, it is more resource-efficient than the alternatives, and, to your point, capable of answering questions that were impossible to answer via other means before. In what way is that ineffective?

16bitvoid 32 minutes ago | parent | next [-]

No, they're saying that 80% of genai queries (aka anything sent to an LLM; I won't speak on the validity of the percentage) are not things someone would search on Google. It's things like trial-and-error vibecoding, openclaw-like agentic loops, talking to chatgpt like it's a person, etc. In other words, most genai queries are not for getting "obscure information" or even getting direct information at all. It's about either getting it to do something you don't want to do yourself, or using it as a replacement for someone else (junior dev, therapist, friend, significant other, etc).

moritzwarhier 4 hours ago | parent | prev [-]

I do plenty of AI queries, both pragmatic ones and some for entertainment: witnessing talktotransformer was mind-bending already at the time! And since then, I've tried frontier models, local, coding agents, and use plenty of them on the regular.

I am in awe of the capabilities of generative AI.

I also enjoy sitting in or driving a car.

I did not want to make a moral argument, unless you consider each and every form of utilitarianism as moralism.

lxgr 4 hours ago | parent | prev [-]

That might be true, but at least I started asking way more questions since we’ve had competent LLMs.

MSFT_Edging 5 hours ago | parent | prev | next [-]

Vonnegut said in his last living work that the greatest addiction modern people face is the drug of cheap oil.

We got addicted to the convenience and overuse, and have started a mass extinction event because of it.

The perverse incentives will come for us all.

moritzwarhier 2 hours ago | parent [-]

It is exactly this thought, in the form of this sentence, that could replace almost all of my comments in this thread.

It feels depressing, but I think the same. When thinking about the larger world, it becomes increasingly hard to ignore. And of course it is not new.

There were "doomers" already in the midst of the 20th century, but it doesn't mean that they were wrong.

bloomca 4 hours ago | parent | prev [-]

I agree with your message but not sure about the conclusion. Cars themselves are commodified luxury available (in the US pretty much required) to everyone, and they do need to be subsidized, both in terms of infrastructure and the lifestyle they require.

But with AI what is the exact price? My understanding is that R&D is extremely expensive, but running non-SOTA models is not that bad. We are getting pretty close to models which can be useful locally in many applications.

Or do you mean that at scale running them locally is not possible and hence the infrastructure price is in data centers, which will be expensive to maintain and scale for demand?

moritzwarhier 4 hours ago | parent [-]

Thanks for asking an open question about my point.

First, because I initially failed to answer your more closed questions (this paragraph is edited in):

> We are getting pretty close to models which can be useful locally in many applications. Or do you mean that at scale running them locally is not possible and hence the infrastructure price is in data centers, which will be expensive to maintain and scale for demand?

I don't think there's a way around making the best of AI capabilities with minimum price and maximum control, and I'd agree this is met by on-prem data centers, just not in a rationally targeted way.

Back to my original comment:

Because it (my conclusion) was not so clear, and maybe I just wanted to highlight some observations without delivering a real argument for or against things [, I thank you for your open question].

The utility/leverage aspect for AI seems more esoteric than the one for cars because, apart from Chatbots, it's more hidden.

And also, similar to cars (or many other phenomena of industrialization), yes, my first vague point was the subsidization of infrastructure. But also, the power gap: that's something not only associated with AI or cars, but with a lot of technologies we all hold dear: sewage, powerline, logistics, etc etc.

What reminds me of cars in the current AI frenzy is the fixation on cementing infrastructure. And also, I think, a lot more people agree on, for example, some kind of universal right to, for example, clean water.

But all of industrialization confronts people with questions of efficiency, inequality, and collective support.

Most people would, for example, support a right to get a minimum amount of clean water when you are living and working in a traditionally inhabited space (if you're on the social-darwinist side) or at least not harming society (if you're more of a social democrat).

And, similar to the buildup of car infrastructure, and the procurement of resources, space etc for maximum building, giant data centers can obstruct people in buying drinking water. Or walking outside (AI obstructs traditional methods of online collaboration).

nilkn 4 hours ago | parent | prev | next [-]

The original point of the stock market was to fund gigantic society-level projects (like railroads). Modern VC has replaced some of that at smaller scales but not all of it at the largest scales. So this could just be the stock market performing the function it was designed to perform -- helping fund something transformative on a societal level.

onlyrealcuzzo 4 hours ago | parent | prev | next [-]

> Take any stock index, remove AI stocks, what do you see? That's right! Nothing...

Where did all the stock gains go before AI?

FAANG / MAG-7.

Was everything from 2012-2020 fake, too?

toasty228 4 hours ago | parent [-]

They went from ~9% of the sp500 to ~35% over your timeframe...

atleastoptimal 4 hours ago | parent | prev [-]

Not sure what your point is. Stock markets are based on money going into securities based on estimated future value. Even if AI were doubling productivity at a non-AI company, there is more leverage to that money going into an AI company.

The question is, is AI leading to massive productivity gains in companies that implement it? AI productivity gains take time to diffuse, but so far companies in the S&P 500 are seeing very high growth. YOY earnings growth rate for the S&P 500 is 21.7% https://advantage.factset.com/hubfs/Website/Resources%20Sect...

toasty228 4 hours ago | parent [-]

> YOY earnings growth rate for the S&P 500 is 21.7%

Now remove the companies selling the AI shovels: https://pbs.twimg.com/media/HIAjbZxacAARHwD.png

> Not sure what your point is.

My point is that they're selling us Skynet and the end of employment as we know it, things that we shouldn't even have to measure to perceive the results of, yet no one is able to measure any of it

Pointing a finger at nvidia, google, and the other few companies stuck in circular investment schemes that shouldn't even be legal and saying "OOGA BOOGA line go UP, UP GOOD!" doesn't count in my book

Jtarii 24 minutes ago | parent | next [-]

Charitably the lag time for this technology to have noticeable effects could just be ~5 years away. Similarly to how computers didn't have a big impact for a decade after they were introduced as people got used to using them.

atleastoptimal 4 hours ago | parent | prev [-]

Is the image you provided depicting revenue, or stock value? My point is about revenue.

toasty228 4 hours ago | parent [-]

Revenues don't matter when you sell a dollar for 50ct and half of the deals are circular anyways

atleastoptimal 3 hours ago | parent [-]

So you're claiming that the revenue growth of the S&P 500 over the last few years is largely due to "selling dollars for 50ct" and circular deals?

toasty228 2 hours ago | parent [-]

Yes.

https://insights.som.yale.edu/insights/this-is-how-the-ai-bu...

> AI-related stocks have accounted for 75% of S&P 500 returns, 80% of earnings growth and 90% of capital spending growth since ChatGPT launched in November 2022.

atleastoptimal 42 minutes ago | parent [-]

has it occurred to you that AI companies may be making huge returns because AI is genuinely increasing productivity and driving actual economic growth via their products?

If all these false practices can pull revenue out of nothing, why doesn’t every company do it? How come AI companies seem to be able to pull off financial magic that no other company can match?

All your analyses still ignore the revenue point.

spmurrayzzz 6 hours ago | parent | prev | next [-]

He has also consistently demonstrated, at least to me, that he doesn't really understand how inference works from a technical perspective, which weakens much of his core thesis for why there should be a collapse.

I do value having some naysayers in the mix generally, because we do need balanced critique in what is otherwise a very frothy hype cycle. I just don't think he's making sound arguments, and that's even assuming you agree with his premises in the first place.

My biggest gripe with his napkin math is that he treats inference gross margins as something novel that you can't compare to normal SaaS margins. He's right in part: the constant carousel of R&D costs from model training, related infrastructure buildout, and other adjacent costs required to stay competitive do change the analysis a bit.

But he takes this way too far when he says this is structurally different from normal SaaS margins. The business model definitely doesn't look like Dropbox, but it absolutely looks a lot like AWS, especially early AWS, CDNs, telecom, etc. I can speak to the telecom bit personally, since it's been over half of my professional career as an engineer and, in this specific case, also as a founder. You can have a brutally capital-intensive infra business where profitability depends on utilization, oversubscription, peak-capacity planning, segmentation, and recovering capex over time.

The math he presents gets even more questionable as we see explicit segmentation happening for cost-saving reasons. Many forward-thinking orgs are waking up to the fact that they don't need to use the best, most expensive model for every task. They can route easier tasks to cheaper models, use caching, batch non-urgent workloads, and reserve frontier models for the subset of work that actually needs frontier intelligence. That directly undermines his claim that providers always need to chase frontier intelligence in order to maintain current demand, utilization, and pricing curves.
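To make the segmentation point concrete, here is a minimal sketch of difficulty-based routing. The tier names, prices, difficulty scores, and threshold are all made up for illustration; they are not any provider's real numbers or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """A hypothetical model tier with an illustrative price."""
    name: str
    usd_per_mtok: float  # assumed price per million tokens


# Assumed tiers; the prices are placeholders, not real list prices.
CHEAP = Tier("small-model", 0.5)
FRONTIER = Tier("frontier-model", 15.0)


def route(task_difficulty: float, threshold: float = 0.7) -> Tier:
    """Send only tasks at or above the threshold to the frontier tier."""
    return FRONTIER if task_difficulty >= threshold else CHEAP


def blended_cost(difficulties, mtok_per_task: float = 1.0, threshold: float = 0.7) -> float:
    """Total spend in USD when every task is routed by its difficulty score."""
    return sum(route(d, threshold).usd_per_mtok * mtok_per_task for d in difficulties)


tasks = [0.1, 0.2, 0.3, 0.9]  # hypothetical difficulty scores in [0, 1]
routed_cost = blended_cost(tasks)
all_frontier_cost = FRONTIER.usd_per_mtok * len(tasks)
print(routed_cost, all_frontier_cost)
```

Under these assumed numbers only one task in four reaches the frontier tier, so most of the spend moves to the cheap tier. Whether that helps or hurts a provider's revenue is a separate question.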

pluto_modadic 5 hours ago | parent | next [-]

I think he doesn't need to understand the technology to point out the books are cooked. a business can sink in either way: the technology flops or the finances flop. he's arguing the /finances/ would flop. he doesn't argue that the /technology/ would flop, only that they can't come up with the money to pay their creditors.

spmurrayzzz 5 hours ago | parent [-]

There is a piece of this I agree with. That you do not need to be a deep technical expert to notice that a company is burning cash by overcommitting to capex, or relying on heroic revenue projections that may or may not come to pass.

But that is not the full argument he is making. If the claim is that the labs will not be able to pay their creditors because inference is structurally incapable of becoming profitable, then he absolutely needs to be right about the technical economics of inference.

One part of that is the balance-sheet argument (which already shows insanely good margins). But it also depends on how inference-time compute actually works: routing, batching, kv cache reuse, model segmentation, different latency tiers, etc. Many of those details he's just been straight up wrong about in his writing, so as a result I have to call into question the rest of his reasoning as well (in part to avoid Gell-Mann amnesia).

dofm 3 hours ago | parent | prev | next [-]

> That directly undermines his claim that providers always need to chase frontier intelligence in order to maintain current demand, utilization, and pricing curves.

But does it also not mean that they will make less money given that there is already brutal competition for that lower tier from openrouter, Deepseek, Amazon, etc.?

You can't on the one hand say "customers are beginning to understand they can spend less" and on the other hand suggest that this is good for forecasts of revenue.

solomatov 6 hours ago | parent | prev [-]

> that he doesn't really understand how inference works from a technical perspective

Could you share what tells about it? I.e. where he was wrong about it?

spmurrayzzz 5 hours ago | parent [-]

There are examples both in his writing and also in his appearances on podcasts, interviews, etc.

I'll cherry pick a couple:

“When these new models ‘reason,’ they break a user’s input and break into component parts, then run inference on each one of those parts.” [1]

This is not at all how test-time compute works. At best, this is a very loose metaphor that he may have used out of convenience. This might sound a bit pedantic to point out, but this is a very basic thing that he's getting wrong (presumably at least, again it could be that he just used a poor metaphor).

A less pedantic example would be his claims related to gpt-5/chatgpt auto-routing. He argued that having a router means OpenAI can no longer cache static prompts, because the user prompt has to come before the hidden instructions [2]. This is just not at all how this works at inference-time. There is no evidence that the standard approach of system>developer>user instruction hierarchy has changed, the public API and caching docs maintain this.

But even more broadly, it suggests he is reasoning about kv/prefix caching at the wrong level of abstraction. It's true that conventional prefix caching does require a stable prefix, so yes, if you literally put variable user content before the static prompt, you would destroy the cacheability of that static prompt.

But that is exactly why inference systems are designed to preserve reusable prefixes where possible (via checkpointing or similar), and why serving systems care so much about prefix caching. This is also a big part of how disaggregated prefill/decode infra works where cache-aware routing is critical. His argument treats a bad prompt layout as if it were a necessary consequence of routing, rather than an avoidable implementation choice.

A router can read the user request, decide which model path to use, and then construct a normal downstream model call with stable static instructions first and user content later. Treating that as impossible implies a fundamental architectural misunderstanding.
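A toy sketch of that last point, assuming a simplistic per-model prefix cache that only counts reusable leading tokens. `pick_model`, the token lists, and the cache itself are hypothetical stand-ins, not how any real serving stack is implemented.

```python
from collections import defaultdict

# Assumed static instructions shared by every request.
STATIC = ["<system>", "You", "are", "helpful", "</system>"]


def pick_model(user_tokens):
    """Hypothetical router: reads the user request and picks a model path."""
    return "large" if len(user_tokens) > 3 else "small"


def static_first(user_tokens):
    """Layout with the stable instructions first and user content later."""
    return STATIC + user_tokens


def user_first(user_tokens):
    """Layout with variable user content ahead of the static instructions."""
    return user_tokens + STATIC


def reused_tokens(layout, requests):
    """Count leading tokens that a toy per-model prefix cache could reuse."""
    caches = defaultdict(set)  # model name -> set of token prefixes seen so far
    reused = 0
    for user_tokens in requests:
        model = pick_model(user_tokens)  # routing happens before prompt assembly
        prompt = layout(user_tokens)
        seen = caches[model]
        hit = 0
        for i in range(1, len(prompt) + 1):
            if tuple(prompt[:i]) in seen:
                hit = i
            else:
                break
        reused += hit
        for i in range(1, len(prompt) + 1):
            seen.add(tuple(prompt[:i]))
    return reused


requests = [["hi"], ["bye"], ["thanks"]]
print(reused_tokens(static_first, requests))
print(reused_tokens(user_first, requests))
```

The router inspects the user tokens in both cases. Only the layout function decides whether the static prefix stays reusable, which is the sense in which the bad layout is an implementation choice and not a consequence of routing.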

[1] https://www.wheresyoured.at/how-to-argue-with-an-ai-booster/

[2] https://www.wheresyoured.at/how-does-gpt-5-work/

oudlys 7 hours ago | parent | prev | next [-]

Productivity is not value. It's quite possible for you to experience productivity improvements, and actual value to not be created. That is what I think the most robust data is showing.

https://unessays.substack.com/p/talk-is-cheap

amatheus 5 hours ago | parent | next [-]

From an economic perspective productivity is defined as the creation of value, isn't it? Then if you "improve productivity" and do not create value in the end, you're not improving productivity at all.

oudlys 5 hours ago | parent | next [-]

It does depend on how you define productivity. But the way it's commonly used is "I'm going faster, personally, with these tools."

The thing I think people have a hard time seeing is that "I go faster" does not mean "more features get finished".

It's a scale issue, and one scale is better than the other. People only pay for finished features, they do not pay for how much code you emit.

fl4regun 3 hours ago | parent [-]

economists define productivity as gdp per hour worked. Like a lot of other economic measurements, it's mostly a bogus number people use as an argument on why their politics are better than someone else's politics. You can have an efficient business located in a poor country making the same product and same quality as that same business in a rich country, the rich country will be more "productive" because local cost of goods is higher there (i.e. a restaurant in NYC is more "productive" than a restaurant in bangladesh).

oudlys 3 hours ago | parent [-]

Sure. But that's not, in my view, how most people use the word productivity when describing LLM use.

In my field - operations - productivity is usually described as some rate of production for a specific asset. 100 widgets / machine / hour - for example.

"My productivity is 3 PRs / day with the LLM as opposed to 1 PR per every three days". That's how I think people are thinking about it.

My point is that's not the same thing as value. I.e. what people will pay for.
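A small worked example of the gap, using the PR rates above. The "PRs per finished feature" figures are invented purely to show how a higher PR rate can leave finished output unchanged; they are not taken from any study.

```python
from fractions import Fraction


def rate(units: int, days: int) -> Fraction:
    """Output rate in units per day, kept exact with Fraction."""
    return Fraction(units, days)


with_llm = rate(3, 1)       # 3 PRs per day
without_llm = rate(1, 3)    # 1 PR every three days
speedup = with_llm / without_llm  # ratio of the two PR rates


def features_shipped(pr_rate: Fraction, days: int, prs_per_feature: int) -> Fraction:
    """Finished features over a period, given how many PRs each one takes.

    prs_per_feature is a hypothetical conversion factor (rework, fixes, review churn).
    """
    return pr_rate * days / prs_per_feature


# Assumed scenario: with the LLM each finished feature takes 9 PRs instead of 1.
shipped_with = features_shipped(with_llm, 30, 9)
shipped_without = features_shipped(without_llm, 30, 1)
print(speedup, shipped_with, shipped_without)
```

In this made-up scenario the PR rate is 9 times higher while the number of finished features over 30 days is identical, which is the distinction between a rate of production and what people will pay for.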

fl4regun 2 hours ago | parent [-]

You're correct, I just wanted to add that there is another definition that you may see used online, and it is very specific, and it's important to be aware it's NOT exactly the same thing most normal people mean when they say "productivity".

w29UiIm2Xz 4 hours ago | parent | prev [-]

Productivity is defined as revenue per worker hour. And we know worker hours are going down as there are fewer workers with the layoffs.

bigstrat2003 7 hours ago | parent | prev | next [-]

Also, supposed productivity gains are dubious. I personally experience at best no productivity gains when using LLMs to write code, and sometimes it's an active drain on my productivity. There was that one study a year or so ago showing similar results. People are trying to say the productivity gains are there and undeniable, but that is not true. It is very much a subject of controversy whether AI helps productivity.

asdfasgasdgasdg 7 hours ago | parent [-]

I can see an argument that the productivity gains are illusory / don’t translate to economic productivity. I’m not denying the possibility.

However, most of the engineers I respect have gone from being skeptics a year ago to convinced today. I don’t personally know any true holdouts any more. If there are studies that disprove productivity gains more than six months ago, I’m happy to believe that it was true of the AIs that were available at the time. But I’m going to need something much more recent before I disbelieve my lyin’ eyes where it pertains to the AIs available today.

oudlys 6 hours ago | parent | next [-]

There is an observational study that was published in March 2026 that followed 4000 teams over 2 years. It shows, in my view, exactly that the productivity gains don't translate into economic value.

Here is the report:

https://www.faros.ai/blog/ai-acceleration-whiplash-takeaways

And my commentary:

https://unessays.substack.com/p/talk-is-cheap

asdfasgasdgasdg 3 hours ago | parent [-]

If it was published in March 2026, even if the data was collected up to the day the study was published, 7/8ths of it would fail my “within the last six months” test. But I am looking forward to the results of future studies on this topic!

oudlys 2 hours ago | parent [-]

I get wanting to wait for more data. And thinking that LLMs have improved enough that this will change.

My view is that it's not really about how good the models are - it's about how we're using them. Understanding what you've built is an important part of value creation, and LLMs eliminate that.

dminik 5 hours ago | parent | prev | next [-]

It's funny, I've noticed the same thing, but did not come to the same conclusion.

I currently don't have work access to Claude Code, but most of my teammates do. Watching from the outside, the cycle seems to look like this:

1. Experience some success, which hooks you into relying on AI.

2. The AI keeps failing at some task, but you don't want to stop. Keep trying over and over again.

3. Run out of tokens and take a break.

Now, sometimes 1 doesn't happen. Sometimes 2 doesn't happen. 3 is a certainty though.

Now, if you told me that the productivity gain from 1 is enough to offset the loss from 2 and 3, I could believe you. But I also wouldn't be surprised if it didn't.

chillacy 4 hours ago | parent [-]

As I work with Claude more and gain a feel for its capabilities, I tend to run into 2 far less often, as I'll decompose my messages more for the current model limitations. The threshold also changes each release.

techblueberry 2 hours ago | parent | prev [-]

I’m going back to being a holdout, but it’s nuanced - My theory into why LLMs don’t lead to the colloquial definition of productivity would be something like - if code was never the bottleneck then generating code faster doesn’t result in more meaningful output.

Even if you take for granted that AI is as good at writing code as the best people say (and I’ve spent a lot of time generating code, I won’t disagree), then the question becomes: does this change your daily incentives such that you reach for code as the solution to your problems rather than something else (coordinating with your colleagues? Product management? Planning and design?)

So from a holistic perspective, I think intentionally limiting your own AI usage is the best approach for maximum long-term productivity.

nyeah 7 hours ago | parent | prev [-]

That's possible, sure. But I think the answer is more likely in the numbers, not in just qualitatively saying AI isn't worth anything. Like if I pay $30k for an ounce of gold, I got value. Gold is worth something. But that amount of gold wasn't worth what I spent.

EDIT: In fact, parent comment has a link to some numbers.

[EDIT: Most] people don't want to go through the numbers. Ok. But there's a history here. When people don't want to see the numbers, certain kinds of things tend to happen.

oudlys 6 hours ago | parent [-]

I've posted numbers that indicate that productivity is becoming decoupled from value delivery. If you follow the link in my comment it reviews a pretty robust study of 4000 teams over 2 years. There is no product throughput increase.

d33d 6 hours ago | parent | next [-]

Yep.

Code acceleration is great, but.... something precedes that. Vision and strategy re. expansion of offerings and businesses. Once a firm reaches maturity in what it offers and is only touching the edges - this code acceleration is literally useless when you factor in all of the trade-offs.

This is a good thing - it means fat and slow incumbents are sitting ducks to be out-witted by creative and imaginative founders, which is healthy for a well-functioning economy.

Now the economics of existing frontier models are not sustainable - it's looking like a mix of the airline (supersonic vs subsonic) and EV industry with China in the background providing decent offerings at much lower prices.

oudlys 4 hours ago | parent [-]

I think it's worse than that.

I admit that if a small team or an individual uses an LLM, it's likely they can create value faster.

I think as soon as you don't own the responsibility for the defects you generate with an LLM, their use starts to destroy value. Regardless of product maturity.

This is what I think the data says.

https://unessays.substack.com/p/talk-is-cheap

nyeah 3 hours ago | parent | next [-]

Yeah this part scares me a little. I imagine it scares everyone who is more than a couple of years out of school. I hear that "the solution to LLM tech debt is more LLM." That might be true, but it might not be.

oudlys 3 hours ago | parent [-]

It scares me too.

I actually think this is precisely the reason LLMs can't be the basis for a technological revolution. Because it's only one way.

Like, if you have a compiler, and it has a bug. You can discover if that bug is influencing your code execution and patch it. You can go both up and down the stack.

With LLMs, there is no way to patch its translation function. You have to rely on it to forward process.

I don't think there is any way to avoid us understanding our tech stacks.

d33d 3 hours ago | parent | prev [-]

You're not really getting it.

If you are producing something that delivers a far better experience, irrespective of what's under the hood (see Claude Code et al), you will decimate an incumbent who is trying to use LLMs in the context of incrementally improving a mature product.

LLMs are suited for the development of revolutionary innovation, not incremental.

oudlys 3 hours ago | parent [-]

I think we mostly agree.

I think I just disagree about the power of the LLM to deliver revolutionary innovation. That's something you do. Not the machine.

And, pretty soon on your journey to scale, the LLM becomes a hindrance rather than a help.

nyeah 6 hours ago | parent | prev [-]

Interesting data, thanks.

lbrito 4 hours ago | parent | prev | next [-]

>undeniable, massive productivity gains.

How can something so undeniable have zero scientific evidence? Are there any large peer reviewed or meta studies confirming your claim?

aspenmartin 4 hours ago | parent | next [-]

It’s a very hard experiment to run. You have a population that’s already “treated”. You can’t blind them to the fact that they’re using AI tools. It’s hard to imagine a study that wouldn’t have serious flaws that people would then use to dismiss and form their own conclusions. Sure you have METR but that was very low n with a very old model.

I think the surest sign of productivity gains is the sheer volume of adoption. If you look beyond headlines, adoption is just incredible. Of course adoption does not necessarily point to productivity gains, but if this was some sort of FOMO or smoke and mirrors you would not see this much retention and this feverish a pace of adoption. You would not see a large segment of the profession using coding agents exclusively. All of these companies track productivity, again with imperfect proxies, yet everything points to a pretty consistent picture. Same with benchmarks, again a lot of crappy benchmarks but a lot of high quality ones too and a very diverse collection of tasks and capabilities they probe.

48terry 4 hours ago | parent [-]

Your second paragraph appears to be 3 different instances of saying "X does not necessarily point to productivity gains... but in the case of AI, X definitely means productivity" without really saying why that is true or why other explanations do not fit.

Adoption meaning productivity supposes there are no other dominant factors for the AI push nor AI retention. It is possible for practices to be picked up or continued in spite of causing productivity DROPS. What studies have suggested are factors that make for productive work environments and what is actually enforced in the workplace are different things.

aspenmartin 4 hours ago | parent [-]

It’s 3 different weak but complementary proxies. We form beliefs from imperfect evidence, and I find these fairly convincing when it’s hard to find any hard evidence of no productivity gains and this is exactly the scenario you would expect under the hypothesis that we do see productivity gains. None of this is supposed to be unassailable. If you disagree, I would ask what evidence you have for that?

Adoption implying at least some significant productivity gains doesn’t contradict there being other factors. You’re seeing entire companies reshaped. The argument is this is all for show or CEOs are in some sort of idiot class?

“It is possible for practices to be picked up or continued in spite of causing productivity drops” well of course. I just find that incredibly far away from Occam’s razor.

My point is: we have lots of evidence that’s highly consistent with real productivity gains, and I don’t see many pieces of evidence to the contrary.

_aavaa_ 4 hours ago | parent | prev [-]

Because even in a field like software engineering, where the output of our work is saved in version control, measuring baseline productivity is hard.

LoC: people argue it’s not what’s important

PRs/day: same as LoC

Getting projects done faster: oh but what about the quality.

Solve the technical problems and actually be more productive, and the social systems built around the old way of doing things will hold you back.

Finishing a PR in 10 minutes doesn’t matter if you’re waiting days for a human review.
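
To put rough numbers on that last point, here is a minimal sketch that treats a PR's cycle time as authoring time plus review wait. All figures are made-up assumptions for illustration (4 hours of authoring cut to 10 minutes, a 2-day review queue), and the function names are hypothetical.

```python
def cycle_time_hours(authoring_hours: float, review_wait_hours: float) -> float:
    """End-to-end time for one PR: time to write it plus time waiting on review."""
    return authoring_hours + review_wait_hours


def end_to_end_speedup(before_authoring: float,
                       after_authoring: float,
                       review_wait: float) -> float:
    """Ratio of old cycle time to new cycle time when only authoring gets faster."""
    before = cycle_time_hours(before_authoring, review_wait)
    after = cycle_time_hours(after_authoring, review_wait)
    return before / after


# Assumed numbers, not measurements.
authoring_before = 4.0        # hours to write the PR by hand
authoring_after = 10 / 60     # 10 minutes with an agent
review_wait = 48.0            # two days in the review queue

authoring_speedup = authoring_before / authoring_after
overall_speedup = end_to_end_speedup(authoring_before, authoring_after, review_wait)

print(f"authoring speedup: {authoring_speedup:.0f}x")
print(f"end-to-end speedup: {overall_speedup:.2f}x")
```

Under these assumptions the authoring step gets about 24x faster while the end-to-end cycle improves by well under 10%, which is the same shape as Amdahl's law: the part you did not speed up dominates.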

frisbee6152 7 hours ago | parent | prev | next [-]

He’s been continuously predicting that the collapse was just around the corner, that progress was slowing, and that there was no market for inference, since 2024.

The fact he’s never reflected on the glaring failures in his analysis tells us what we need to know about his intellectual integrity. There’s truth in some of his words about financial risk, but if you can’t acknowledge that there’s upside too, you can’t evaluate risk properly either.

I find it difficult to take him seriously.

dofm 3 hours ago | parent | next [-]

Progress is slowing, in an important way.

Have a muck about with what Qwen 3.6 or Gemma 4 can do and you'll see. I mean this as an illustration but Qwen just isn't as far behind as I expected, and compared to the data centre hardware it will run on a potato.

The frontier models are losing their undeniable edge over that which is unmetered.

And even putting aside my optimism for the smaller open weights models, there's a huge amount of scope for the larger, hosted open weights models that are only just behind the cutting edge and which cost, what, 1/25th of the price on opencode go, openrouter etc.

Commodification is coming, and with it slimmer profit margins; it's hard to see them making anywhere near the kind of money they need to in a commodified market.

solomatov 6 hours ago | parent | prev | next [-]

> progress was slowing

Do you think it's not slowing? Do I miss anything really important?

My understanding is that what we have now is incremental improvement on thinking models, which appeared more than a year ago. Of course, a breakthrough might happen, but I don't see one yet.

frisbee6152 5 hours ago | parent [-]

The most important thing I would point to is Mythos et al and the wave of vulnerabilities that have been discovered in the past couple months. It’s a completely unprecedented event, brought forth almost entirely by improvements in the models themselves.

That said, keep in mind, I’m talking about the past two years. With Claude Code and the capabilities gained since December of last year, there have been incredible gains in the capabilities that are now available. Demand for inference is higher now than it was a year ago, because capability has improved. A specific criticism I would make is that claiming that progress with LLMs was slowing, prior to that point, is embarrassingly wrong in my view.

One could argue that the model capability improvements are slowing, and all the improvements were in harnesses. I think that’s a stronger argument, but I have a few problems with it.

1. Utility is utility. Whether that comes from the model or the harness is irrelevant when making claims about utility. I don’t think that’s a useful distinction most of the time, but especially when talking about the technology as a whole.

2. Marginal intelligence gain is different than marginal utility gain. It’s estimated that intelligence grows logarithmically relative to investment. However, the utility of a marginally more intelligent model may grow exponentially, because once behavior crosses a reliability threshold, it unlocks new capabilities.

3. Even on those terms, it’s not clear to me that frontier capabilities are slowing down. With Mythos and its contemporaries, we have been seeing a vast change in the security industry as vulnerabilities are discovered at an unprecedented rate. OpenBSD vulnerabilities, more Firefox vulnerabilities found in a single month than the past two years, critical Linux vulnerabilities. It’s hard for me to look at the effects there, radical new capabilities baked into the model itself, and see stagnation.

A part of the reason it might feel like it’s slowing down is because we plebs don’t have access to the top models.
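
One way to picture point 2 (marginal intelligence vs. marginal utility) is a toy model, not anything measured: assume a task needs `n_steps` sequential steps and each step independently succeeds with probability `p`. The independence assumption, the step count, and the probabilities below are all illustrative choices.

```python
def task_success_rate(per_step_success: float, n_steps: int) -> float:
    """Probability that every one of n_steps independent steps succeeds."""
    return per_step_success ** n_steps


# Assumed values for illustration only.
n_steps = 50
weaker = task_success_rate(0.95, n_steps)    # a model that is right 95% of the time per step
stronger = task_success_rate(0.99, n_steps)  # a model that is right 99% of the time per step

print(f"95% per step over {n_steps} steps: {weaker:.3f}")
print(f"99% per step over {n_steps} steps: {stronger:.3f}")
print(f"ratio: {stronger / weaker:.1f}x")
```

In this sketch a four-point gain in per-step reliability moves a 50-step task from failing most of the time (about 8% success) to succeeding most of the time (about 60%). That is the kind of threshold effect the argument relies on; whether real models and tasks behave this way is a separate question.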

lompad 5 hours ago | parent | next [-]

The maintainer of curl - who has access to mythos - disagrees [0].

I think it's dangerous to rely on claims made by people who financially profit from you believing them without checking.

[0]: https://daniel.haxx.se/blog/2026/05/11/mythos-finds-a-curl-v...

frisbee6152 3 hours ago | parent | next [-]

The article says in the second section that the author did not have access to Mythos. I think it’s dangerous to rely on claims made by others without even bothering to read them first, let alone check.

It found hundreds of vulnerabilities in Firefox, according to Mozilla: how does Mozilla benefit? It found a 27 year old vulnerability in OpenBSD. How do they benefit from that? Is that made up? Are the maintainers of those codebases lying for the benefit of Anthropic’s IPO? Is copy fail a fabrication by big AI? The 12 OpenSSL vulnerabilities found in January?

https://venturebeat.com/security/mythos-detection-ceiling-se... https://www.wired.com/story/mozilla-used-anthropics-mythos-t... https://cyberscoop.com/copy-fail-linux-vulnerability-artific... https://www.schneier.com/blog/archives/2026/02/ai-found-twel...

I’m not sure whose claims you think I’m relying on. I trust Firefox that they’re not overstating the number of CVEs they’ve found. Same for OpenSSL. The OpenBSD folks definitely don’t seem like the types. I’ve not known Linux to fabricate CVEs either. I think my sources are fine.

jsnell 4 hours ago | parent | prev [-]

That blog post is very clear about the maintainer having no access to Mythos.

IsTom 4 hours ago | parent [-]

Does it matter that somebody else ran it for him?

jsnell an hour ago | parent [-]

When it is explicitly an appeal to authority, and the basis for the authority is incorrect? Feels like it matters.

And presumably the GP thought that saying the maintainer had access to Mythos made it a more compelling argument. Otherwise why even mention it?

slopinthebag 5 hours ago | parent | prev [-]

Do you have access to Mythos?

frisbee6152 3 hours ago | parent [-]

Nope. Just watching the volume and severity of CVEs coming through since it’s been running. It’s been a busy few months.

mschuster91 3 hours ago | parent | prev | next [-]

> He’s been continuously predicting that the collapse was just around the corner, that progress was slowing, and that there was no market for inference, since 2024.

Old WSB saying: The market can remain irrational for (far) longer than one can remain solvent.

And unfortunately, a lot of the market on the "buyer" side has been acting irrationally. When you see CEOs telling their employees that they don't care about token cost, only about "how much AI do you use" because that is what the stock market wants to hear - that's when you know we're all getting cooked, the question is how long it takes until the bubble bursts.

bdangubic 7 hours ago | parent | prev [-]

anyone that takes him seriously at this point... I don't want to say very bad words here...

ndnjdjdkdjd 6 hours ago | parent [-]

[dead]

gdcbe 7 hours ago | parent | prev | next [-]

I do not disagree with what you are saying, but I honestly still believe that most of the utility we experience is gonna become very boring very soon, something we can just run locally... Even if it's a bit slower, who cares? It can just run in the background while you work on other stuff yourself, read up on things, review other work...

It's not that the utility of it is in question. What is a giant question mark, however, is how the heck any of the big AI companies are ever gonna get that ROI, given how many of us are becoming more and more fine with local models that run just fine, especially on a good enough computer, which most developers have anyway...

cogman10 7 hours ago | parent [-]

Even more dangerous to the big 2 AI companies is the fact that the 20 different Chinese companies are catching up fast and for a lot lower cost.

Why should someone pick Opus 4.8 when Qwen3.7 Plus produces similar results for about 1/20th the cost?

That sort of pricing disparity is across the board. But further, it's becoming more and more apparent that they are doing more with fewer parameters. That's what's giving the local models their super powers.

remich 4 hours ago | parent [-]

Because it doesn't. Not for the tasks where using Opus instead of a lower tier model is appropriate, at any rate. Benchmarks show this, as do revealed preferences of actual users. To believe that Qwen is as capable as Opus at 1/20 the cost you have to believe that every person who does not make the choice to use Qwen over Opus for a given task is some mix of ignorant or delusional. This is certainly an opinion you can hold about other engineers, but it's definitely a questionable one at best.

cogman10 4 hours ago | parent [-]

The benchmarks between the two are close and the engineers that have used both (like myself) can attest that the differences aren't so wide as you might believe.

I'd say that yes, ignorance plays a role here because a decent number of engineers are looking strictly at the benchmarks and choosing Opus just for that reason.

But I'd also say that a major factor for Opus use is because Opus is being purchased for the engineers by their employers. They don't get to pick which models they are using.

dofm 3 hours ago | parent | prev | next [-]

He has recently made the very good point that actually, the FAANG companies are struggling to put any ROI numbers on that incredible ground-level utility.

Uber, for example, is so unclear there is any ROI, they are cutting their exposure pretty radically.

He points out that one single Anthropic customer — a payments provider — accidentally had to pay Anthropic $500M for one month of token spend.

That is half what Apple is reportedly paying Google for the supply side of their entire consumer AI strategy.

squidsoup an hour ago | parent [-]

It doesn't matter under Capitalist Realism, the banks were bailed out, the AI companies will be bailed out, and you will pay for it. There is no alternative.

demorro 4 hours ago | parent | prev | next [-]

They are absolutely deniable. Huge swathes of people deny them.

elorant 7 hours ago | parent | prev | next [-]

Even if we assume that everything you said holds true, how is it that we as a crowd can make viable a service that eats some $300bn annually in infrastructure costs? Where would that money come from? Most tech companies these days are cutting their AI budgets because the per token pricing is killing them.

aspenmartin 4 hours ago | parent [-]

Cite a real source for that last bit, I don’t think that is true. Also, the budgets should be cut; the spend at some places goes beyond any reasonable amount. The strategy there is to hook everything in and find the right processes, then cut the rest. Things then get better and better with each model release.

The way you make a viable service that eats 300bn annually is to have enough demand to service that. Anthropic underbought compute. That tells you something.

elorant 3 hours ago | parent | next [-]

https://www.theverge.com/tech/930447/microsoft-claude-code-d...

https://finance.yahoo.com/sectors/technology/articles/ai-bin...

https://blog.pragmaticengineer.com/the-pulse-token-spend-bre...

jazzcomputer 26 minutes ago | parent | prev [-]

When you say "Things then get better and better with each model release."

How far behind are models that can be run locally, and do you expect that this will be widespread?

PedroBatista 5 hours ago | parent | prev | next [-]

> undeniable, massive productivity gains.

The jury is still out on that.

deaton 4 hours ago | parent [-]

Yeah they're very much deniable. Raw LOC/hr is much higher, and putting together an MVP is faster, but I've yet to see any evidence that an LLM is capable of doing anything unsupervised, and if you need a human supervising everything it does... why bother having an LLM in the first place?

aspenmartin 4 hours ago | parent [-]

Because it can perform much faster? Monitoring allows you to multitask more effectively. I would also disagree that you can’t one shot anything…claims like this are weak and I have enough counter examples in my own life that it’s trivially false. The question is more: can it one shot the right things with a low enough failure rate for it to be a good replacement. It’s hard to figure that out a priori.

cm277 6 hours ago | parent | prev | next [-]

Agreed that he has an extreme POV (or more accurately that he trolls for views/subscriptions). But his central argument is valid: if AI underdelivers financially, this bubble will burst and this bubble is magnitudes larger than what we've seen before, so there could be very rough seas ahead.

The question is: what does "underdeliver" mean here? The pro-AI arguments I am seeing in this thread are equating mass adoption to agentic coding. Er, I don't know of any trillion dollar cap companies that sell dev tools. The point is Zitron doesn't have to be 100% right for his central prediction to come true.

aspenmartin 4 hours ago | parent [-]

I don’t get this. We already have an insane demand. And yes exactly, this is primarily just with coding agents, but are you aware of what’s coming down the pipeline? It’s not hard to be; you just have to find a decent way to keep up with the literature.

* robotics (need to close data gap and release first viable product to get a data flywheel)

* conversational ai (no one is ready for this and we’re getting closer and closer to natural speech. The quality still isn’t good enough but it’ll be soon).

* other agentic use cases, openclaw adoption was crazy and that had a ton of barriers to entry

* ai products, like the one OpenAI is working on with Jony Ive

Anyone thinking it’s unreasonable to hit whatever revenue requirements is just not that aware of what’s happening. Not to mention we’re capacity constrained already!! This is barely speculation at this point.

sterlind 4 hours ago | parent [-]

I don't think the issue with robotics is a data gap. maybe somewhat, but the real issues are that:

- RL is extraordinarily sample-inefficient.

- distribution shift/catastrophic forgetting aren't solved. only off-policy learning with giant decorrelated batches works.

- the breakout success of transformers as an architecture doesn't neatly translate to robot motion policy models.

the field is missing fundamental breakthroughs.

I also find it very interesting that conversational AI has taken this long. where are the models with good turn-taking? passive listening? the ability not to respond in paragraphs? has Anthropic simply not gotten around to it?

aspenmartin 4 hours ago | parent [-]

All of these points are great. The first one motivates world models, which lots of labs work on. Not many people tend to understand the strategic value of those “open world” or interactive generation models: it’s robotics and planning. But you’re right, there are complicated problems to solve and the timeline isn’t totally clear. But where there’s data and compute, there’s a way.

For conversational AI, these labs do have lots of things to do lol, but you’re right; it likely also requires some architectural improvements. But you can see the infancy: look at the llama4 speech duplex model. Very unimpressive, yet all of the components are there. It’s just a matter of pushing on them, licensing and commissioning better data, etc. That takes time, and compute is stretched thin.

Leynos 2 hours ago | parent | prev | next [-]

I quite like my mechanical spider from Wild Wild West and the coffee it makes with a 50% success rate

freejazz 7 hours ago | parent | prev | next [-]

Every day people here debate whether or not there are any actual productivity gains from LLM, and it's only in the limited context of software development. While I understand that this place obviously skews heavily towards the software industry, the notion that LLMs are anywhere near as useful in other industries is hubristic (at best).

remich 4 hours ago | parent [-]

Perhaps they aren't, but not currently viable !== always unviable.

amlib 2 hours ago | parent | next [-]

Is it really worth it to cause a global economic collapse and harm society's well-being to an unimaginable degree just to find out if it is viable?

Why can't it naturally grow and prove its worth?

48terry 4 hours ago | parent | prev | next [-]

Just 5 more years and $500 billion more, bro. We're still so early.

freejazz 3 hours ago | parent | prev [-]

And?

themafia 5 hours ago | parent | prev | next [-]

> through undeniable, massive productivity gains.

And where are those? They seem particularly hard to actually observe and only appear in anecdotes.

> I'm trying to believe

For every exponential increase in compute capacity you see linear gains in output accuracy. This is a death spiral. Anyways, you see "massive productivity gains" so why is "belief" a function of your viewpoint?

mawadev 7 hours ago | parent | prev | next [-]

I really like some good drama slop that reads like a thriller, it is entertaining. I don't take any of it THAT seriously, but lately with the IPOs that are about to hit the indices, he has gained a lot of attention. If you look around the internet, most people publish a negative angle on something and then extrapolate it into some grand conspiracy, which is really captivating. It's crazy when you enter some echo chamber you never engage with (movies, gaming, art/comics) and they have their own headcanon for why the world is bad and collapsing. It puts your echo chamber into perspective to see the same patterns of argumentation and presentation spin out in a different way.

bakugo 5 hours ago | parent | prev | next [-]

> undeniable, massive productivity gains.

Just because you keep repeating something doesn't make it an undeniable truth.

enraged_camel 7 hours ago | parent | prev | next [-]

Yes. Zitron has been predicting and begging for collapse since 2024. It's not just his brand at this point. It's his entire identity. As such, he cannot back down, he cannot question himself, and he cannot accept any other viewpoint. And he will keep moving his goal posts until something happens that can make him go "aha! I told you guys!!"

This, combined with his extreme ignorance, makes him unreadable. The only reason people read his stuff is because it validates and confirms their own anti-AI beliefs. It's why every time he publishes an article, it reaches the front page in an hour or less.

nozzlegear 7 hours ago | parent [-]

> This, combined with his extreme ignorance,

Extreme ignorance?

AlexandrB 7 hours ago | parent | prev | next [-]

> undeniable, massive productivity gains

How are they undeniable? They're very deniable. One example is the (seemingly) increasing maintenance costs for AI-generated code[1]. Another is the cost incurred by everybody reading AI slop instead of actual communication.

I don't have hard data as to whether these cancel out the benefits, but it's not as rosy as some seem to think.

[1] After years of people understanding that LOC is not only a poor productivity metric but also a negative indicator of code quality (shorter code for the same thing is better), we now have people touting how many LOC their LLM agent is generating. It's like everyone forgot what LOC actually represents and what it means for long term maintenance costs.

dist-epoch 7 hours ago | parent | prev | next [-]

> Zitron is begging for a collapse at this point

No, he's not, he's making tons of money every month from his Substack subscriptions. In fact, the AI bubble popping would be the worst thing ever for him, he would be out of a job.

Just like the people who have predicted the US dollar will collapse any-moment-now and who have pushed gold for decades.

Funny how people always say "oh, you are an AI lab, of course you are going to hype AI", but never "oh, you make sooo much money from predicting the collapse of the AI bubble..."

ndnjdjdkdjd 6 hours ago | parent | prev | next [-]

[dead]

ath3nd 4 hours ago | parent | prev | next [-]

[dead]

alexashka 7 hours ago | parent | prev | next [-]

[flagged]

vanuatu 5 hours ago | parent [-]

i don't think this comment contributes much to the discussion. can you elaborate more than saying "no"?

4 hours ago | parent [-]
[deleted]
righthand 7 hours ago | parent | prev | next [-]

[flagged]

selectodude 7 hours ago | parent | prev [-]

[flagged]