jackfranklyn a day ago

davydm nails it. The gap isn't in generating code - it's in everything else that makes software actually work.

I've been building accounting tools for years. AI can generate a function to parse a bank statement CSV pretty well. But can it handle the Barclays CSV that has a random blank row on line 47? Or the HSBC format that changed last month? Or the edge case where someone exports from their mobile app vs desktop?
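
The defensive scaffolding itself isn't hard to type out once you know which failures to expect - a minimal sketch in Python (the blank-row and header handling here are invented for illustration, not any real bank's format):

    import csv
    import io

    def parse_statement(raw: str) -> list[dict]:
        # Illustrative only: real exports vary per bank and per channel.
        reader = csv.reader(io.StringIO(raw))
        # Drop fully blank rows (the "random blank row on line 47" case).
        rows = [r for r in reader if any(cell.strip() for cell in r)]
        if not rows:
            raise ValueError("empty statement")
        # Normalize headers, since mobile and desktop exports differ.
        header = [h.strip().lower() for h in rows[0]]
        records = []
        for row in rows[1:]:
            if len(row) != len(header):
                continue  # malformed row: skip rather than crash
            records.append(dict(zip(header, row)))
        return records

The hard part is knowing, from production incidents, which of these checks you actually need.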

That's not even touching the hard stuff - OAuth token refresh failures at 3am, database migrations when you change your mind about a schema, figuring out why Xero's API returns different JSON on Tuesdays.
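
None of that is exotic, but it's all state the model never sees. The refresh dance alone looks roughly like this sketch (the `store` object and field names are hypothetical; real providers differ on token rotation and expiry):

    import time
    import requests

    def get_access_token(store):
        tok = store.load()  # hypothetical persistence layer
        # Refresh proactively before expiry; the 3am failures usually come
        # from refreshing only after a 401, when the refresh token itself
        # may already have been rotated or revoked.
        if tok["expires_at"] - time.time() > 60:
            return tok["access_token"]
        resp = requests.post(tok["token_url"], data={
            "grant_type": "refresh_token",
            "refresh_token": tok["refresh_token"],
            "client_id": tok["client_id"],
            "client_secret": tok["client_secret"],
        }, timeout=10)
        resp.raise_for_status()
        fresh = resp.json()
        tok["access_token"] = fresh["access_token"]
        # Some providers rotate the refresh token on every use.
        tok["refresh_token"] = fresh.get("refresh_token", tok["refresh_token"])
        tok["expires_at"] = time.time() + fresh.get("expires_in", 1800)
        store.save(tok)  # persist atomically, or a crash here loses the session
        return tok["access_token"]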

The real paradox: AI makes starting easier but finishing harder. You get to 80% fast, then spend longer on the last 20% than you would have building from scratch - because now you're debugging code you don't fully understand.

estimator7292 a day ago | parent | next [-]

When I got hired at my current job, they handed me an AI-generated app. It did a pretty reasonable job on the frontend, I think (I'm not a React guy), but the backend was a disaster. Part of it involved parsing a file, and they had somehow fed the AI a test file with the first 20 bytes truncated. I could tell the AI had tried hard to force the parser to match the file spec and ended up inserting checks for magic byte values that made no sense.
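
For contrast, a spec-conformant parser just rejects a broken file up front - something like this sketch (the magic value and offsets are invented, not the real spec):

    MAGIC = b"\x89FMT"  # invented magic bytes, for illustration only

    def parse(data: bytes) -> dict:
        # A correct parser validates the magic and fails loudly. Feed it a
        # file with the first 20 bytes cut off and it raises immediately,
        # instead of growing checks for whatever garbage happens to sit at
        # offset 0 in the truncated sample.
        if len(data) < 8 or data[:4] != MAGIC:
            raise ValueError("not a valid file: bad magic bytes")
        version = int.from_bytes(data[4:8], "little")
        return {"version": version, "payload": data[8:]}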

It took me a few days to realize what was happening. Once I got some good files it was just a couple hours to understand the problem. Then three weeks untangling the parser and making it actually match the spec.

And then three months refactoring the whole backend into something usable. It would have taken less time to redo it from scratch. If I'd known then what I know now, I would have scrapped the whole project and started over.

dinfinity a day ago | parent | next [-]

Did you use AI to help you understand the code and what it was doing (incorrectly)?

vaylian a day ago | parent | prev [-]

The AI can generate a lot of Chesterton's Fences. It's difficult to figure out why they are there and whether they are needed.

nunez a day ago | parent | prev | next [-]

LLMs (well, most of the frontier and popular open-source models) are actually quite good at abiding by weird formats like this, provided your prompt describes them clearly enough. The real problem is that you'll have to manually spot-check the results, as LLMs are also very good at adding random incorrectness. This can take just as long as (or longer than!) writing the code + tests yourself.
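
The only mitigation I know of is pinning the output to hand-verified fixtures before trusting it - a tiny runnable sketch (the fixture values are made up):

    import csv
    import io

    def test_spot_check_generated_parser():
        # Hand-verified fixture (values invented for illustration),
        # including a stray blank row like real exports have.
        raw = "date,amount\n2024-01-03,-42.50\n\n2024-01-04,1200.00\n"
        rows = list(csv.DictReader(io.StringIO(raw)))  # DictReader skips blank rows
        got = [(r["date"], float(r["amount"])) for r in rows]
        assert got == [("2024-01-03", -42.50), ("2024-01-04", 1200.00)]

    test_spot_check_generated_parser()

It catches the sign flips and silent date-format changes, but writing the fixtures is exactly the work you were hoping to skip.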

kaycey2022 8 hours ago | parent | prev | next [-]

I would say that handling these edge cases - minor changes that a human can easily understand but that are very hard to program rules for - is exactly where AI is the perfect fit.

rtp4me a day ago | parent | prev | next [-]

Interesting, but isn't the real issue here that external systems can and will change their output at random? Given that you are probably a domain expert in this situation, you can easily solve the issue based on past experience. But what if a junior person encountered these errors? Do you think they would have enough background to solve these issues faster than an AI tool?

KurSix 13 hours ago | parent | prev | next [-]

Exactly. That last 20% is engineering: handling edge cases, integrating with quirky APIs, optimizing for performance under load. An LLM excels when all conditions are perfect, but the real world is a mess of imperfections.

dimitri-vs a day ago | parent | prev | next [-]

As someone that's currently building accounting (and many many other) tools for myself: yes, it can.

But with a big fat asterisk that you:

1. Need to make it aware of all relevant business logic.
2. Give it all the necessary tools to iterate and debug.
3. Have significant experience with the strengths and weaknesses of coding agents.

To be clear I'm talking about cli agents like Claude Code which IMO is apples and oranges vs ChatGPT (and even Cursor).

KellyCriterion a day ago | parent | prev | next [-]

No, it can't handle those perfectly - but it can help you develop the required code to do it correctly much faster :-)

garden_hermit a day ago | parent [-]

This just returns us to the question — if it makes all these things so easy and fast, where are the AI-generated apps? Where is the productivity boost?

thunky a day ago | parent | next [-]

How do you expect this boost will appear?

People start announcing that they're using AI to do their job for them? Devs put "AI generated" banners all over their apps? No, because people are incentivised to hide their use of AI.

Businesses, on the other hand, announce headcount reductions due to AI and of course nobody believes them.

If you're talking about normal people using AI to build apps, those apps are all over the place, but I'm not sure how you would expect to find them unless you're looking. It's not like we really need that many new apps right now, AI or not.

callc a day ago | parent | next [-]

Any metric that measures the amount of software delivered.

The link at the bottom of the post (https://mikelovesrobots.substack.com/p/wheres-the-shovelware...) goes over this exactly.

> Businesses, on the other hand, announce headcount reductions due to AI and of course nobody believes them.

It’s an excuse. It’s the dream peddled by AI companies: automate intelligence so you can fire your human workers.

Look at the graphs in the post, then revisit claims about AI productivity.

The data doesn’t lie. AI peddlers do.

ogogmad a day ago | parent [-]

Given the amount of progress in AI coding in the last 3 years, are you seriously confident that AI won't increase programming productivity in the next three?

This reminds me of the people who said that we shouldn't raise the alarm when only a few hundred people in this country (the UK) got Covid. What's a few hundred people? A few weeks later, everyone knew somebody who did.

rsynnott a day ago | parent | next [-]

Okay, so if and when that happens, get excited about it _then_?

Re the Covid metaphor: that only works because Covid was the pandemic that did break out. It is arguably the first one in a century to do so. Most putative pandemics actually come to very little (see SARS1, various candidate pandemic flus, the mpox outbreak, various Ebola outbreaks, and so on). Not to say we shouldn't be alarmed by them, of course, but "one thing really blew up, therefore all things will blow up" isn't a reasonable thought process.

wizzwizz4 a day ago | parent | prev [-]

AI codegen isn't comparable to a highly-infectious disease: it's been a lot more than a few weeks. I don't think your analogy is apt: it reads more like rhetoric to me. (Unless I've missed the point entirely.)

anorwell a day ago | parent [-]

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...

From my perspective, it's not the worst analogy. In both cases, some people were forecasting an exponential trend into the future and sounding an alarm, while most people seemed to be discounting the exponential effect. Covid's doubling time was ~3 days, whereas the AI capabilities doubling time seems to be about 7 months.
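
To put numbers on those two doubling times (simple compounding, taking both figures at face value):

    def growth(days: float, doubling_days: float) -> float:
        # Growth factor after `days` given a fixed doubling time.
        return 2 ** (days / doubling_days)

    print(growth(365, 3))       # Covid at a 3-day doubling: ~2^122 in a year
    print(growth(365, 7 * 30))  # task horizon at ~7-month doubling: ~3.3x/year

The absolute rates differ wildly, but a steady exponential looks unimpressive right up until it doesn't.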

I think disagreement in threads like this often traces back to a miscommunication about the state of things today versus the trajectory. Skeptics are usually saying: capabilities are not good _today_ (or worse: capabilities were not good six months ago, when I last tested them - see this OP, which predates Opus 4.5). Capabilities forecasters are saying: given the trend, what will things be like in 2026-2027?

wizzwizz4 a day ago | parent [-]

The "COVID-19's doubling time was ≈3 days" figure was the output of an epidemiological model, based on solid and empirically-validated theory, based on hundreds of years of observations of diseases. "AI capabilities' doubling time seems to be about 7 months" is based on meaningless benchmarks, corporate marketing copy, and subjective reports contradicted by observational evidence of the same events. There's no compelling reason to believe that any of this is real, and plenty of reason to believe it's largely fraudulent. (Models from 2, 3, 4 years ago based on the "it's fraud" concept are still showing high predictive power today, whereas the models of the "capabilities forecasters" have been repeatedly adjusted.)

bccdee a day ago | parent | prev | next [-]

The article provides a few good signals: (1) an increase in the rate at which apps are added to the app store, and (2) reports of companies forgoing large SaaS dependencies and just building them themselves. If software is truly a commodity, why aren't people making their own Jiras and Figmas and Salesforces? If we can really vibe something production-ready in no time, why aren't industry-standard tools being replaced by custom vibe clones?

thunky a day ago | parent [-]

> If we can really vibe something production-ready in no time, why aren't industry-standard tools being replaced by custom vibe clones?

That's a silly argument. Someone could have made all of those clones before, but didn't. Why didn't they? Hint: it's not because it would have taken them longer without AI.

I feel like these anti-AI arguments are intentionally being unrealistic. Just because I can use Nano Banana to create art does not mean I'm going to be the next Monet.

bccdee a day ago | parent [-]

> Why didn't they? Hint: it's not because it would have taken them longer without AI.

Yes it is. "How much will this cost us to build" is a key component of the build-vs-buy decision. If you build it yourself, you get something tailored to your needs; however, it also costs money to make & maintain.

If the cost of making & maintaining software went down, we'd see people choosing more frequently to build rather than buy. Are we seeing this? If not, then the price of producing reliable, production-ready software likely has not significantly diminished.

I see a lot of posts saying, "I vibe-coded this toy prototype in one week! Software is a commodity now," but I don't see any engineers saying, "here's how we vibe-coded this piece of production-quality software in one month, when it would have taken us a year to build it before." It seems to me like the only software whose production has been significantly accelerated is toy prototypes.

I assume it's a consequence of Amdahl's law:

> the overall performance improvement gained by optimizing a single part of a system is limited by the fraction of time that the improved part is actually used.

Toy prototypes proportionally contain a much higher share of the rote greenfield scaffolding that agents are good at writing. The stickier problems of brownfield growth and robustification are absent.
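
Plugging in illustrative numbers (the splits and the 10x factor are assumptions, not measurements): if agents speed up the rote scaffolding 10x, a prototype that is 80% scaffolding gets ~3.6x faster overall, while production work that is only 20% scaffolding gets ~1.2x:

    def amdahl(p: float, s: float) -> float:
        # Overall speedup when fraction p of the work is accelerated by s.
        return 1 / ((1 - p) + p / s)

    print(amdahl(0.8, 10))  # toy prototype: ~3.57x
    print(amdahl(0.2, 10))  # production codebase: ~1.22x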

garden_hermit a day ago | parent | prev | next [-]

I would expect a general rise in productivity across sectors, with the largest gains concentrated in the tech sector given the focus on code generation: a proliferation of new apps, new features, and new functionality at a quicker pace than pre-AI. Given the hype, one would expect an inflection point in this sector's productivity, but it mostly just appears linear.

I am very willing to believe that there are many obscure and low-quality apps being generated by AI. But this speaks to the fact that mere generation of code is not productive - generating quality applications requires other forms of labor that are not presently satisfied by generative AI.

thunky a day ago | parent [-]

> A proliferation of new apps, new features, and new functionalities at a quicker pace than pre-AI

IMO you're not seeing this because nobody is coming up with good ideas - we're already saturated with apps. And apps are already releasing features faster than anyone wants them. How many app reviews have you read that say "Was great before the last update"? Development speed and ability isn't the thing holding us back from great software releases.

rsynnott a day ago | parent | prev [-]

I would expect a _big_ increase in the production of amateur/hobbyist games. These aren't demand-driven; they're basically passion projects. And that doesn't seem to be happening; Steam releases are actually modestly _down_, say.

cheevly a day ago | parent [-]

Asset generation is hard.

KellyCriterion a day ago | parent | prev [-]

It's not productivity-boosting in the sense of "you can leave 2h earlier", but in the sense of "you get more done faster", resulting in more stuff created. That's my general assumption/approach for using AI to code.

When it comes to "AI-generated apps" that work out of the box, I do not believe in them - I think the tools are not good enough (yet?) to create a "complete" app. Context and the like are required, especially for larger apps and for connecting the building blocks - I do not think there will be any remarkable apps coming out of such a process.

I see the AI tools just as a junior developer who will create data structures, functions, etc. when I instruct it to do so: it assists in code creation & optimization, but not in complete app architecture (except maybe as a sparring partner).

samsullivan a day ago | parent | prev | next [-]

All of these problems are better articulated at the level you just explained them at. The code for these issues is convoluted and is only of use when an entity (human or not) can actually manipulate the symbolic text that achieves the task. A random OAuth stub is of zero use to even the most skilled programmer without documentation of the contracts and invariants. Bits in a file are just a means.

oxag3n a day ago | parent | prev | next [-]

Shit in, shit out. If a 'notech' throws a sample CSV at anyone and asks for a parser for it, an engineer who doesn't care and an LLM will give similarly shitty results.

Parsers and data serialization in general are a mature and fairly standardized area of software engineering. Can AI write a good parser? Maybe. Will it, though?

nostrademons a day ago | parent | prev | next [-]

I've heard AI coding described as "It makes the first 80% fast, and the last 20% impossible."

...which makes it a great fit for executives that live by the 80/20 rule and just choose not to do the last 20%.

senordevnyc a day ago | parent | prev [-]

> AI makes starting easier but finishing harder. You get to 80% fast, then spend longer on the last 20% than you would have building from scratch - because now you're debugging code you don't fully understand.

I run a SaaS solo, and that hasn't really been my experience, but I'm not vibe coding. I fully understand all the code that my AI writes when it writes it, and I focus on sound engineering practices, clean interfaces, good test coverage, etc.

Also, I'm probably a better debugger than AI given an infinite amount of time and an advantage in available tools, but if you give us each the same debugging tools and see who can find and fix the bug fastest, it'll run circles around me, even for code that I wrote myself by hand!

That said, as time has gone on, the codebase has grown beyond my ability to keep it all in my head. That made me nervous at first, but I've come to view it just like pretty much any job with a large codebase, where I won't be familiar with large parts of the codebase when I first jump into them, because someone else wrote it. In this respect, AI has also been really helpful to help me get back up to speed on a part of the codebase I wrote a year ago that I need to now refactor or whatever.