| ▲ | pron 2 hours ago |
| > Over the past year, I’ve watched engineers use AI to ship in days what used to take a team weeks. No, you didn't. You watched engineers use AI to ship in days something that looks like what used to take a team weeks. After enough rounds of feature evolution, you'll realise that what they actually shipped isn't at all the same. Anthropic's C compiler, which also seemed like a good start that would have taken people much longer to deliver, ended up being impossible to turn into something actually workable. In a year or so, software developed by "AI-native talent who can manage fleets of agents to drive outsized impact" - which is another way of saying people who ship code they don't understand and therefore haven't fixed the architectural mistakes the agents make - will become impossible to evolve, and then things will get very interesting. AI can help software developers in many ways, but not like that. |
|
| ▲ | adamtaylor_13 15 minutes ago | parent | next [-] |
| I am an engineer. I hire other engineers. I run a company that ships usable software for small businesses. We do this every day. I'm sorry to say, we are indeed shipping in days what used to take weeks. |
|
| ▲ | randallsquared an hour ago | parent | prev | next [-] |
| > In a year or so. Look at the best models from Spring 2025, and compare with now (and similarly for Springs 2024 and 2025). Armstrong and lots of others are betting that this trend will continue, and if it does, the LLMs will ship code the LLMs understand, and whether any human specifically understands any particular part will mostly not matter. |
| |
| ▲ | pron an hour ago | parent | next [-] | | And if the trend doesn't continue? I understand that a company with Coinbase's performance has little to lose and not many options, but many companies are in a better position. The problem is that executives could take the 15-20% productivity boost and be content, but they read stuff like this, get greedy, and they don't understand the risk they're taking. | | |
| ▲ | atonse 28 minutes ago | parent | next [-] | | Even if the trend doesn’t continue, the current models are very, very good. They’re already better than the average programmer in the industry. | | |
| ▲ | zeroonetwothree 12 minutes ago | parent [-] | | Maybe at some coding benchmark. Certainly not at actually shipping and maintaining production grade software. |
| |
| ▲ | randallsquared an hour ago | parent | prev [-] | | Agreed! That will be an... "interesting" outcome, if so, for a lot of these companies. |
| |
| ▲ | bix6 an hour ago | parent | prev [-] | | > and whether any human specifically understands any particular part will mostly not matter. This is how I feel. It’s building things for me that work. I don’t care how it works under the hood in many cases. | | |
| ▲ | pron an hour ago | parent [-] | | It's not about caring how it works. It's about caring that it keeps working at all even after you add stuff to it for a year or three (and nearly all software written by companies is software they evolve). | | |
| ▲ | bix6 an hour ago | parent [-] | | And who’s to say it won’t? It’s working now. I’m adding stuff and it’s still working. Why won’t that continue in year 3? | | |
| ▲ | pron 36 minutes ago | parent | next [-] | | If you carefully read the agent's output, you'll see why. It adds layers upon layers of workarounds and defences that hide serious problems, until the codebase reaches a point where the agent can no longer understand it or work with it. All the tests pass right up until the moment when adding a feature or fixing a bug causes another bug, and then nothing and no one can save the codebase anymore. | |
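A hypothetical sketch of the defensive layering pattern described above: a later "fix" swallows an error instead of removing its cause, so the test suite passes while the defect stays buried. All names here are invented for illustration; this is not code from any real agent transcript.

```python
# Hypothetical illustration of "workaround layering": each fix wraps
# the real problem instead of removing it.

def parse_price(raw):
    # Layer 1: the original logic, which assumes well-formed input.
    return float(raw.strip("$"))

def parse_price_safe(raw):
    # Layer 2: a later "fix" that swallows the failure instead of
    # asking why malformed input reaches this point at all.
    try:
        return parse_price(raw)
    except (ValueError, AttributeError, TypeError):
        # Hides the bug; downstream totals are now silently wrong.
        return 0.0

# The tests pass, so the defect is invisible until someone
# depends on the totals being correct.
assert parse_price_safe("$19.99") == 19.99
assert parse_price_safe(None) == 0.0
```

Each such layer makes the code harder to reason about, for humans and agents alike, which is the compounding cost the comment above is pointing at.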
| ▲ | titularcomment 8 minutes ago | parent | prev | next [-] | | Maintaining software is like 80% of the job. | |
| ▲ | techblueberry 24 minutes ago | parent | prev [-] | | Because the APIs it uses will change? Nothing in tech is static. And that’s just going to get worse re: this whole AI thing. |
|
|
|
|
|
| ▲ | smrtinsert 21 minutes ago | parent | prev | next [-] |
| Yeah, absolutely embarrassing take. If I had a nickel for every time someone sent me some AI garbage that was supposedly "thoroughly vetted and cross-checked agent output", I'd be at least a thousandaire (gotta keep it real). There are strengths, but if you think you can take its stream of code and just use it as is, I would LOVE to compete against you. |
|
| ▲ | tokioyoyo an hour ago | parent | prev [-] |
| I commented this yesterday, I’ll repeat it again: what do you guys think organizations that have heavily leaned into AI are shipping nowadays? Most devs aren’t working on cutting-edge, low-level, mission-critical systems, and AI is great for that kind of work. Every company I personally know has been fast-shipping features that have been used daily by millions of people for the past 7 months. We have the same thing on my team, and we also understand the limitations of AI-generated code. If you’re more or less experienced, you can easily see the “good” and “bad” sides of it. So you kinda plan it out in a way that lets you “evolve AI-generated software”. I wouldn’t have said the same thing in January 2025, but times are much different now. Things are already working. |
| |
| ▲ | pron an hour ago | parent | next [-] | | > If you’re more or less experienced, you can easily see the “good” and “bad” sides of it. So you kinda plan it out in a way that you can “evolve AI generated software”. If you're truly "managing fleets of agents" there's no way you're able to sift through the good and the bad in the output. If your AI-generated code is evolvable (which is hard to tell right now) then you're not writing it with "fleets of agents". If you are writing it with fleets of agents, I would bet it's not evolvable; you just haven't reached the breaking point yet. | |
| ▲ | Zetaphor an hour ago | parent | prev [-] | | Most of the people making this argument vastly overestimate the quality of engineering and discipline behind the software powering most corporations. CRUD apps are likely the most prominent type of application across industries, and most of them are crud. | | |
| ▲ | pron 39 minutes ago | parent [-] | | If the code is really simple, it's cheap to read. When people don't read it (and when they need to use "fleets of agents"), it's because it's not so simple, and then the people who trust the outcome are the ones who don't know what they've committed into the codebase. Their logic is no more than: the system hasn't collapsed under the load of 50 (or 500) changes, so it probably won't collapse under the load of the next 500 (or 5000). Because that's how engineered systems work, right? If they're fine under light stress, they're fine under heavier stress. |
|
|