the_mitsuhiko 14 hours ago

I'm going to shill my own writing here [1] but I think it addresses this post in a different way. Because we can now write code so much faster and quicker, everything downstream from that is just not ready for it. Right now we might have to slow down, but medium and long term we need to figure out how to build systems in a way that it can keep up with this increased influx of code.

> The challenge is to develop new personal and organizational habits that respond to the affordances and opportunities of agentic engineering.

I don't think it's the habits that need to change, it's everything. From how accountability works, to how code needs to be structured, to how languages should work. If we want to keep shipping at this speed, no stone can be left unturned.

[1]: https://lucumr.pocoo.org/2026/2/13/the-final-bottleneck/

fmbb 13 hours ago

I don’t think we can expect all workers at all companies to just adopt a new way of working. That’s not how competition works.

If agentic AI is a good idea and if it increases productivity, we should expect to see some startup blowing everyone out of the water. I think we should be seeing it now if it makes you, say, ten times more productive. A lot of startups have had a year of agentic AI now to help them beat their competitors.

ej88 13 hours ago

We're already seeing eye-watering, blistering growth from the new hot applied AI startups and labs

Imo the wave of top-down 'AI mandates' from incumbent companies is a direct result of the competitive pressure, although it probably won't work as well as the execs think it will

that being said even Dario claims a 5-20% speedup from coding agents, 10x productivity only exists in microcosm prototypes, or if someone was so unskilled oneshotting a localhost web app is a 10x for them

bwestergard 13 hours ago

"eye-watering, blistering growth from the new hot applied AI startups and labs"

Could you give us a few examples?

simonw 13 hours ago

Claude Cowork was apparently built in less than two weeks using Claude Code, and appears to be getting significant usage already.

sjaiisba 12 hours ago

Only a personal anecdote, but the humans I know that have used it are all aware of how buggy it is. It feels like it was made in 2 weeks.

Which gets back to the outsourcing argument: it’s always been cheap to make buggy code. If we were able to solve this, outsourcing would have been ubiquitous. Maybe LLMs change the calculus here too?

bwestergard 9 hours ago

That's certainly a good example of a tool developed quickly thanks to AI assistance.

But coding assistance tools must themselves be evaluated by what they produce. We won't see significant economic growth through using AI tools to build other AI tools recursively unless there are companies using these tools to make enough money to justify the whole stack.

I believe there are teams out there producing software that people are willing to pay for faster than they did before. But if we were on the verge of rapid economic growth, I would expect HN commenters to be able to rattle these off by the dozen.

ej88 12 hours ago

claude code 1B+ arr

ant 10xing ARR, oai

harvey legora sierra decagon 11labs glean(ish) base10(infra) modal(infra) gamma mercor(ish) parloa cognition

regulated industries giving these companies 7/8-fig contracts less than 2 years from incorporation

sjaiisba 13 hours ago

AI has been a lifesaver for my low performing coworkers. They’re still heavily reliant on reviews, but their output is up. One of the lowest output guys I ever worked with is a massive LinkedIn LLM promoter.

Not sure how long it’ll last though. With the time I spend on reviews I could have done it myself, so if they don’t start learning…

HWR_14 3 hours ago

> With the time I spend on reviews I could have done it myself, so if they don’t start learning…

Then? Your job is still to review their code. If they are your coworker, you cannot fire them.

whoisthemachine 3 hours ago

Then just start rubber-stamping their code. Say you "vibe" read it.

simonw 13 hours ago

OpenClaw went from first commit in late November to Super Bowl commercial (it's meant to be the tech behind that AI.com vaporware thing) in February.

(Whether you think OpenClaw is good software is kind of beside the point.)

qudat 36 minutes ago

OpenClaw is not going to be a thing in 6 months. The core idea might exist but that codebase is built on a house of cards and is being replicated in 10% of the code.

I don’t think anyone is arguing against code agents being good at prototypes, which is a great feat, but most SWE work is built on maintaining code over time.

fmbb 12 hours ago

It’s very much not beside the point. Productivity is measured in how much value you get out from the hours your workers put in.

falcor84 12 hours ago

But that only gets you to a philosophical argument about what "value" is. Many would argue that being able to get your thing into a Super Bowl commercial is extremely valuable. I definitely have never built anything that did.

It's very much imperfect, but the only consistently agreed upon and useful definition of "value" we have in the West is monetary value, and in that sense, we have at least a few major examples of AI generating value rapidly.

fmbb 10 hours ago

OK but that also means VR was a success, and web 3, and NFTs.

falcor84 9 hours ago

Well, yes, these were definitely a success for some. And I personally still believe that VR will be a success in the longer-term.

In any case, I agree with the grandparent post about the distinction between being successful and good.

jdahlin 13 hours ago

One of the most interesting aspects is when LLMs are cheap and small enough so that apps can ship with a builtin one so that it can adjust code for each user based on input/usage patterns.

awepofiwaop 3 hours ago

The clear intent is to stop allowing regular people to be able to compute...anything. Instead, you'll be given a screen that only connects to $LLM_SERVER and the only interface will be voice/text in which you ask it to do things. It then does those things non-deterministically, and slower than they would be done right now. But at least you won't have control over how it works!

chickensong an hour ago

Whether or not the intent is as nefarious as you suggest, that type of UI is going to be a boon for a lot of people. Most people on the planet are incredibly computer illiterate.

candiddevmike 13 hours ago

If this could ever happen, there will be no point in GUI apps anymore, your AI assistant or what have you will just interact with everything on your behalf and/or present you with some kind of master interface for everything.

I don't see a bunch of small agents in the future, instead just one per device or user. Maybe there will be a fleeting moment for GUI/local apps to tie into some local, OS LLM library (or some kind of WebLLM spec) to leverage this local agent in your app.

bryanrasmussen 3 hours ago

>If this could ever happen, there will be no point in GUI apps anymore, your AI assistant or what have you will just interact with everything on your behalf and/or present you with some kind of master interface for everything.

sort of how the hammer is the most useful tool ever and all we have to do is to make every thing that needs doing look like a nail.

jdahlin 13 hours ago

Agents will still have to communicate with each other, so the communication protocols, and how data is stored, presented and queried, will still be important for us to decide.

Will we stop using web browsers as we understand them today in the next few decades in favor of only interacting with agents? Maybe.

jazzypants 13 hours ago

I've heard this referenced multiple times and I have yet to hear the value be clearly articulated. Are you saying that every user would eventually be using a different app? Wouldn't it eventually get to the point that negates the need for the app developer anyways since you would eventually be unable to offer any kind of support, or are we just talking design changing while the actual functionality stays the same? How would something like this actually behave in reality?

jdahlin 13 hours ago

I don't know!

These are valid points. Taken to the extreme, we will have apps that cannot be supported.

In the short term, we already have SQL/reports being automated. Lovable etc. is experimenting with generating user interfaces from prompts; soon we will have complete working apps from a prompt. Why not have one core that you can expand via a prompt?

I am currently studying and depending heavily on Anki, and it's been amazing to use Claude Code to add new functionality on the fly. It's a holy mess of inconsistent/broken UX, but it so clearly gives me value over the core version. Sometimes it breaks, but CC can usually fix it within a prompt or two.

baal80spam 13 hours ago

> I've heard this referenced multiple times and I have yet to hear the value be clearly articulated.

Me too, and I see this as _incredibly_ wasteful.

a_better_world 13 hours ago

LISP returns!

coldtea 13 hours ago

>but medium and long term we need to figure out how to build systems in a way that it can keep up with this increased influx of code.

Why? Why do we need to "write code so much faster and quicker" to the point we saturate systems downstream? I understand that we can, but just because we can doesn't mean we should.

falcor84 12 hours ago

> to the point we saturate systems downstream

But that's the point of TFA, no? Now that writing code is no longer the bottleneck, the upstream and downstream processes have become the new bottlenecks, and we need to figure out how to widen them.

As I see it, the end goal for all of this is generating software at the speed of thought, or at least at the speed of speech. I want the digital butler to whom I could just say - "I'm not happy with the way things happened today, please change it so that from here on, it'll be like x" - and it'll just respond with "As you wish", and I'll have confidence that it knows me well enough and is capable enough to have actually implemented the best possible interpretation of what I asked for, and that the few miscommunications that do occur would be easy to fix.

We're obviously not close to that yet, but why shouldn't we build towards it?

layer8 11 hours ago

> Now that writing code is no longer the bottleneck

I think it’s contestable that writing the code was ever the main bottleneck.

> As I see it, the end goal for all of this is generating software at the speed of thought, or at least at the speed of speech.

The question is what distinguishes that from having AGI, and if the answer is “nothing”, then that will change the whole game entirely again.

falcor84 11 hours ago

Oh, absolutely, my vision depends on AGI (and maybe even ASI), and I definitely agree that it'll be a whole new ball game.

the_mitsuhiko 13 hours ago

If we want to continue to ship at that speed we will have to. I’m not sure if we should, but seemingly we are. And it causes a lot of problems right now downstream.

coldtea 13 hours ago

We were already rushing and churning products and code of inferior quality before AI (let's e.g. consider the sorry state of macOS and Windows in the past decade).

Using AI to ship more and more code faster, instead of to make code more mature, will make this worse.

simonw 13 hours ago

I want to use AI to ship more and more code faster and better. If AI means our product quality goes down we should figure out better ways to use it.

slopinthebag 2 hours ago

Shouldn't you want to ship less code that does more? Since when was LoC the relevant benchmark for engineering?

simonw 39 minutes ago

Less code isn't as important as it used to be, because the cost of maintaining (simple) code has gone down as well.

With coding agent projects I find that investing in DRY doesn't really help very much. Needing to apply the same fix in two places is a waste of time as a human. An agent will spot both places with grep and update them almost as fast as if there was just one.

It's another case where my existing programming instincts appear to not hold as well as I would expect them to.

slopinthebag 34 minutes ago

When you talk about maintaining code, do you mean having the LLM do it and you maintain a write-only codebase? Because if you're reading the code yourself and you have a bloated tangled codebase it would make things much harder right?

Is the goal basically a codebase where your interactions are mediated through an LLM?

coldtea 11 hours ago

I'm betting on it meaning the product quality going down - and technical debt increasing, which will be dealt with by more AI in a downward spiral. Meanwhile college CS majors won't ever bother learning the basics (as AI will handle their coursework, and even their hobby work). Then future AI will train on previous AI output, with the degradation that brings...

ehnto 2 hours ago

I was having this conversation at work, where if the promise of AI coding becomes true and we see it in delivery speed, we would need to significantly increase the throughput of all other aspects of the business.

simonw 13 hours ago

Totally agree - that's what I was trying to get at with "organizational habits". The way we plan, organize and deliver software projects is going to radically change.

I'm not ready to write about how radically though because I don't know myself!

username223 2 hours ago

> If we want to keep shipping at this speed

Do we? Spewing features like explosive diarrhea is not something I want.

SignalStackDev 13 hours ago

The linked article is worth reading alongside this one.

The thing I'd add from running agents in actual production (not demos, but workflows executing unattended for weeks): the hard part isn't code volume or token cost. It's state continuity.

Agents hallucinate their own history. Past ~50-60 turns in a long-running loop, even with large context windows, they start underweighting earlier information and re-solving already-solved problems. File-based memory with explicit retrieval ends up being more reliable than in-context stuffing - less elegant but more predictable across longer runs.
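To make "file-based memory with explicit retrieval" concrete, here is a minimal sketch, assuming an append-only JSONL file and lookup by topic. The `FileMemory` class, its method names, and the topic/note layout are all hypothetical and only for illustration; they are not taken from any particular agent framework.

```python
import json
from pathlib import Path


class FileMemory:
    """Hypothetical append-only JSONL notes that an agent loop queries explicitly."""

    def __init__(self, path):
        self.path = Path(path)

    def remember(self, topic, note):
        # One JSON object per line, appended so earlier history is never rewritten.
        with self.path.open("a", encoding="utf-8") as f:
            f.write(json.dumps({"topic": topic, "note": note}) + "\n")

    def recall(self, topic):
        # Explicit retrieval: return only the notes filed under the requested topic.
        if not self.path.exists():
            return []
        notes = []
        with self.path.open(encoding="utf-8") as f:
            for line in f:
                entry = json.loads(line)
                if entry["topic"] == topic:
                    notes.append(entry["note"])
        return notes
```

The idea is that the loop calls `recall(...)` for the topic at hand before each step and puts only those notes in the prompt, rather than carrying the whole transcript forward.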

Second hard part: failure isolation. If an agent workflow errors at step 7 of 12, you want to resume from step 6, not restart from zero. Most frameworks treat this as an afterthought. Checkpoint-and-resume with idempotent steps is dramatically more operationally stable.
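A rough sketch of checkpoint-and-resume, assuming each step has a unique name and the list of completed step names is persisted to a JSON file. `run_workflow` and the checkpoint layout are hypothetical, not any specific framework's API.

```python
import json
from pathlib import Path


def run_workflow(steps, checkpoint_path):
    """Run (name, callable) steps in order, skipping names already checkpointed."""
    path = Path(checkpoint_path)
    done = json.loads(path.read_text()) if path.exists() else []
    for name, step in steps:
        if name in done:
            continue  # completed in an earlier run; do not repeat it
        step()  # if this raises, the checkpoint still lists the earlier steps
        done.append(name)
        path.write_text(json.dumps(done))  # persist after every completed step
    return done
```

Calling `run_workflow` again with the same checkpoint file after a failure picks up at the failed step instead of starting over. Steps still need to be idempotent, because a crash between `step()` returning and the checkpoint write means that step runs again on resume.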

Agree it's not just habits - the infrastructure mental model has to change too. You're not writing programs so much as engineering reliability scaffolding around code that gets regenerated anyway.