danabramov 16 hours ago
I believe the author explicitly suggests strategies for dealing with this problem; that is the entire second half of the post. There’s a big difference between acting as a human tester in the middle and building out enough guardrails that it can do meaningful autonomous work with verification.
WhyOhWhyQ 16 hours ago
I'm just extremely skeptical about that because I had many ideas like that and it still ended up being miserable. Maybe with Opus 4.5 things would go better, though. I did choose an extremely ambitious project, to be fair. If I were to try it again I would pick something more standard and a lot smaller. I put like 400 hours into it, by the way.
irrationalfab 15 hours ago
+1... as with a large enough engineering team, this is ultimately a guardrails problem, which in my experience with agentic coding is very solvable, at least in certain domains.