jdlshore a day ago

> AI has been competent enough to code like the best human programmer

It’s really not. Opus 4.8 can’t produce good software design and it still makes straightforward implementation mistakes. Two errors it made in one day for me recently: it built the Cookie class I asked for without a name field—cookies have a name and a value—and it neglected to handle a case where a database could have multiple rows with the same id, just returning whatever came back first.

The “best human programmers” absolutely would not have made those mistakes. At worst, they would have asked if I really meant what they thought I meant.
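To make the second mistake concrete, here is a minimal sketch of the defensive behavior being described. It is an illustration only, not the code from the project: the `Row` shape, the in-memory array standing in for a database, and the `findUniqueById` name are all hypothetical.

```typescript
// Hypothetical row shape; a plain array stands in for the real database.
interface Row {
  id: string;
  payload: string;
}

// Returns the single row with the given id, or undefined if there is none.
// Throws when several rows share the id instead of silently returning
// whichever one happened to come back first.
function findUniqueById(rows: readonly Row[], id: string): Row | undefined {
  const matches = rows.filter((row) => row.id === id);
  if (matches.length > 1) {
    throw new Error(
      `expected at most one row with id "${id}", found ${matches.length}`,
    );
  }
  return matches[0];
}
```

Whether a duplicate should raise an error, be logged, or be resolved by some tie-breaking rule is a design decision the thread does not specify; the point of the sketch is only that the case gets handled explicitly.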

montfort 20 hours ago | parent [-]

I understand your point, but what you're describing is exactly the kind of mistake even the best human programmer could make in a poorly managed environment. I'm concerned that since AI emerged, we've overestimated our own programming abilities. The comparisons we make between our work and AI's assume an absolute perfection that doesn't exist in reality. Bugs aren't an invention of AI; they're ours. All of modern software engineering (testing systems, version control systems, and so on) was developed through years of dealing with our own mistakes. We don't make systems fault-tolerant by assuming that failures are external to our work. These failures are our doing, and now they're AI's doing too. We have to apply to those agents the management that we previously applied among ourselves. The example you provide is a very good one: you yourself, with your problem-solving human mind, suspect that the problem originated in poor communication, and you are very likely right. Just as with a human error, the programmer is responsible for the faulty code, but you are responsible for poor process management. The same applies to working with AI agents.

jdlshore 8 hours ago | parent [-]

Sorry, I have to disagree. People often react to criticisms of AI with “but people also make mistakes,” but that’s a whataboutism fallacy.

The statement was that AI is as good as the “best human programmer” and it’s quite obvious that it’s not. It makes inhuman mistakes on a regular basis because it’s not using human thinking. Blaming those mistakes on poor management is just sweeping the problems under the rug.

I don’t know the best way to work with AI, but I do know that we’ll only discover the best way if we’re honest about its capabilities. That includes not pretending it’s as good as the best human programmers.

montfort 6 hours ago | parent [-]

I suppose our opinions stem from different experiences. I don't expect AI to do all the work with just a paragraph of instructions. Some people do, and they get very poor results. I design large, complex systems based on microservices, and so far I haven't encountered any of the obvious and glaring errors that other users report. For each project, I've spent two or three weeks working on thousands of lines of specification documents, user stories, plans, and task lists, using DDD. My prompts consist of dozens of files with 10,000-20,000 lines in total. Because the implementation tasks are extensive and atomic, AI has worked very well for me in solving them.

My experience shows that AI can program like the best programmers; its code is very good when given precise instructions, just like a human. I've encountered problems elsewhere, such as anti-patterns in unwired modules, which are "large-scale" implementation errors. I'm resolving these thanks to an open source tool I'm building for AI cognitive governance, and it's yielded excellent results for me. The code produced at both small and large scales is high quality.

In my experience, people experiencing gross AI errors are doing so because they aren't giving it precise instructions. And by precise instructions, I don't mean a highly refined prompt or "vibe-coding"; I'm talking about instructions thousands of lines long, just like the ones we create when developing with human teams.

If two people are using the same model, and one reports that the AI "neglected to handle a case where a database could have multiple rows with the same ID" while the other says they can develop a huge microservices system with multiple databases without any major issues, then in my experience one of them probably isn't using the tool optimally.

jdlshore 3 hours ago | parent [-]

Yeah, your experience is definitely different than mine. When I’m working with human teams, I don’t spend weeks giving them thousands of lines of precise instructions. We work incrementally, having fairly brief conversations to make sure we’re on the same page about the tasks we’re tackling, and then letting individual pairs work out the details of each task… which they do, because they’re experienced professionals.

For example, the project we were working on was to add support for reading a session cookie to a codebase that, up until now, had used a different kind of auth. Fairly straightforward, everybody knows what a session cookie is and how it works. In about 10 minutes, we decided on the big picture design elements (how it was going to fit into our existing system, what we needed to add/modify, etc.) and the corresponding tasks.

One of the things we wanted was an “UntrustedCookie” class to represent the cookie. It was meant to follow a pattern we had already established for other user-controlled input. Our HttpServerRequest object was going to have a new getCookie() method that returned it.
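For readers who want to picture the task, a rough sketch of such a class might look like the following. Only the names `UntrustedCookie`, `HttpServerRequest`, and `getCookie()` come from the comment; the fields, the constructor signatures, the header map, and the parsing details are assumptions made for illustration, and the project's established pattern for user-controlled input is not shown in the thread.

```typescript
// Sketch only: everything beyond the three names taken from the thread
// is a hypothetical choice, not the project's actual design.
class UntrustedCookie {
  readonly name: string;
  readonly value: string;

  // A cookie carries both a name and a value.
  constructor(name: string, value: string) {
    this.name = name;
    this.value = value;
  }
}

class HttpServerRequest {
  // Assumed shape: lower-cased header names mapped to raw header values.
  private readonly headers: Readonly<Record<string, string | undefined>>;

  constructor(headers: Readonly<Record<string, string | undefined>>) {
    this.headers = headers;
  }

  // Returns the first cookie with the given name, or undefined if absent.
  getCookie(name: string): UntrustedCookie | undefined {
    const header = this.headers["cookie"];
    if (header === undefined) return undefined;

    // The Cookie header is a list of name=value pairs separated by ";".
    for (const pair of header.split(";")) {
      const separator = pair.indexOf("=");
      if (separator === -1) continue; // skip malformed pairs without "="
      const cookieName = pair.slice(0, separator).trim();
      if (cookieName === name) {
        // Split on the first "=" only, so values may contain "=" themselves.
        return new UntrustedCookie(cookieName, pair.slice(separator + 1).trim());
      }
    }
    return undefined;
  }
}
```

Under these assumptions, `new HttpServerRequest({ cookie: "session=abc" }).getCookie("session")` would yield a cookie whose `name` is `"session"` and whose `value` is `"abc"`. The value is untrusted input and would still need validation before use.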

This would have been about 30 min of work for a pair to implement, including tests. It’s pretty trivial. No further documentation is needed.

Anyway, I’m glad AI is working for you. My experience is that it often fails, and does so in ways that experienced humans don’t.