skeeter2020 6 hours ago
I spent about a week doing an "experiment" greenfield app. I saw 4 types of issues:

0. It runs way too fast and far ahead. You need to slow it down, force planning only, have it explicitly present a multi-step (i.e. numbered) plan, and say "we'll do #1 first, then do the rest in future steps".
take-away: This is likely solved with experience and changing how I work - or maybe caring less? The problem is the model can produce much faster than you can consume, but it runs down dead ends that destroy YOUR context. I think if you were running a bunch of autonomous agents this would be less noticeable, but it would impact 1-3 negatively and get very expensive.

1. Lots of "just plain wrong" details. You catch these while developing or testing because the code doesn't work, or you know from experience it's wrong just by looking at it. Or you've already corrected it and need to point the model back to the previous context.
take-away: If you were vibe coding you'd solve all these eventually. Addressing #0 with "MORE AI" would probably help (i.e. AI to play/validate, etc).

2. Serious runtime issues that are not necessarily bugs. Examples: it made a lot of client-side API endpoints public that didn't even need to exist, or at least needed to be scoped to the current auth. It missed basic filtering and SQL clauses that constrain data. It hardcoded important data (not necessarily secrets) like ports, etc. It made assumptions that worked fine in development but could be big issues in public.
take-away: AI starts to build traps here. Vibe coders are in big trouble because everything works, but that's not really the end goal. Problems could range from 3am downtime call-outs to getting your infrastructure owned or data breaches. More serious: experienced devs who go all-in on autonomous coding might be three months from their last manual code review and be in the same position as a vibe coder. You'd need a week or more to onboard, figure out what was going on, and fix it, which is probably too late.

3. It made (at least) one huge architectural mistake (this is a pretty simple project so I'm not sure there's space for more). I saw it coming but kept going in the spirit of my experiment.
take-away: TBD. I'm going to try to use AI to refactor this, but it is non-trivial. It could take as long to fix as the initial app took to build. If you followed the current pro-AI narrative you'd only notice it when your app started to intermittently fail - or you got your cloud provider's bill.
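To make point 2 concrete, here is a minimal sketch of the kind of trap described: a query that works in development but is not scoped to the current user, next to a scoped version, plus a port read from the environment instead of hardcoded. This is an illustration only, not code from the experiment; the `notes` table, the function names, and the `APP_PORT` variable are all hypothetical.

```python
import os
import sqlite3


def make_db():
    # Hypothetical in-memory schema, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE notes (id INTEGER PRIMARY KEY, owner_id INTEGER, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO notes (owner_id, body) VALUES (?, ?)",
        [(1, "alice private"), (2, "bob private"), (1, "alice second")],
    )
    return conn


def list_notes_unscoped(conn):
    # Looks fine with a single test user; returns every user's rows otherwise.
    return [row[0] for row in conn.execute("SELECT body FROM notes ORDER BY id")]


def list_notes_scoped(conn, current_user_id):
    # The WHERE clause constrains results to the authenticated user,
    # and the parameter is bound rather than interpolated into the SQL.
    rows = conn.execute(
        "SELECT body FROM notes WHERE owner_id = ? ORDER BY id",
        (current_user_id,),
    )
    return [row[0] for row in rows]


def get_port(default=8000):
    # Read the port from the environment (APP_PORT is an assumed name)
    # and fall back to a default instead of hardcoding it at call sites.
    return int(os.environ.get("APP_PORT", default))
```

Both query functions "work" in a single-user dev setup, which is why this class of problem tends to survive until someone reviews the code or it reaches production.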
Schiendelman 2 hours ago
I'm a product manager, and a lot of the things I see people do wrong come from not having any product management experience. It takes quite a bit of work to develop a really good theory of what should be in your functional spec. Edge cases come up all the time in real software engineering, and handling them is often spread across multiple engineers. A good product manager has a view of all of it, expects many of those issues from the agent, and plans for coaching it through them.