Animats 3 days ago

Meanwhile, the complexity of the average piece of software is drastically increasing. ... The stats suggest that devs are shipping more code with coding agents. The consequences may already be visible: analysis of vendor status pages [3] shows outages have steadily increased since 2022, suggesting software is becoming more brittle.

We've already seen a large-scale AWS outage because of this. It could get much worse. In a few years, we could have major infrastructure outages that the AI can't fix, and no human left understands the code.

AI coders, as currently implemented, don't have a design-level representation of what they're doing other than the prompt history and the code itself. That inherently leads to complexity growth. This isn't fundamental to AI. It's just a property of the way AI-driven coding is done now.

Is anybody working on useful design representations as intermediate forms used in AI-driven coding projects?

"The mending apparatus is itself in need of mending" - "The Machine Stops", by E.M. Forster, 1909.

switchbak 2 days ago | parent | next [-]

I think we're heading for a real crisis here. We've got an imperfect system of constraints and bottlenecks, and we've just eliminated one of the main bottlenecks: the speed at which we can add new code. This puts so much more strain on the rest of the system that I think the industry is about to get a quick lesson on the non-linear costs of software complexity.

I'm glad to see that the author of the article is putting an emphasis on simplicity here, especially given the nature of their business. Those that fully embrace the "code doesn't matter" approach are in for a world of hurt.

Long-term, I expect there will be more tooling and model advancements to help us in this regard - and there will certainly be a big economic incentive for that soon. But in the meantime it feels like a dam has been breached and we're just waiting for the real effects to become manifest.

Cthulhu_ 2 days ago | parent | prev | next [-]

I was curious about the claim about those vendor status pages, wondering if there's postmortems that actually single out AI. The source cited as [3] is a Reddit post with a poorly cropped chart, and it doesn't include any data from before 2022: https://www.reddit.com/r/sysadmin/comments/1o15s25/comment/n...

I'm not saying it's wrong, because I haven't actually looked for alternative sources, just that the source isn't great.

bootsmann 2 days ago | parent [-]

Amazon holds engineering meeting following AI-related outages - Financial Times https://www.ft.com/content/7cab4ec7-4712-4137-b602-119a44f77...

mpweiher 2 days ago | parent | prev | next [-]

> AI coders, as currently implemented, don't have a design-level representation of what they're doing other than the prompt history and the code itself.

That new design-level representation will be code.

It will need to be code, because prompts, while dense, are not nearly deterministic enough.

It will need to be much higher level code, because current code, while deterministic, is not nearly dense enough.

brandensilva 2 days ago | parent | prev | next [-]

Even the prompt history is notoriously weak given how little Claude Code and some of the others display to give developers confidence in the process.

There needs to be more design representation indeed.

tdeck 2 days ago | parent [-]

Claude Code displays plenty in my opinion, if you make it ask you for approval before each code change. You can read the code as it's being built up and understand if it's going in a bad direction before it does that and then piles on more and more slop.

The trouble is people don't want to bother reviewing the changes.

LoganDark 2 days ago | parent [-]

Claude Code used to stream the thinking process in verbose mode. Now that has been replaced with "transcript mode" which doesn't actually give much more information and also doesn't stream anything. They also recently removed (in certain situations) the counter of how many tokens the model's generated in its response in progress, so the only way to tell if it's stuck is to wait 10 minutes and then retry.

Sure, I can read the diffs as they're generated (and I do). But proper transparency goes further than that, and it's being stripped away.

9dev 2 days ago | parent | prev [-]

While I also view this development critically, why do you assume AI will be unable to fix the issues eventually?

yoyohello13 2 days ago | parent [-]

Whether they'll be able to in the future is kind of irrelevant. The fact is that right now they can't, but many people are using them as if they can.

mycall 2 days ago | parent [-]

I tend to disagree. When guided through many rounds of code review, AI can self-correct if shown where the general issues are. It does take practice to speak the model's language, e.g. "drift" instead of "issues". A human in the loop is good enough to produce useful and accurate code today.

shimman 2 days ago | parent [-]

If you can actually do this, please sell your services. You'll become a multi-millionaire overnight: if you can provide a workflow that doesn't result in mass hallucinations or incorrect suggestions, you're doing something no LLM company has managed.

The more common use case is that these tools struggle immensely on anything outside the happy path.