sunir 2 days ago
All true if you one-shot the code.

If you have a sophisticated agent system that uses multiple forward and backward passes, the quality improves tremendously. Based on my setup as of today, I’d imagine by sometime next year that will be normal and then the conversation will be very different; mostly around cost control. I wouldn’t be surprised if there is a breakout popular agent control flow language by next year as well.

The net is that unsupervised AI engineering isn’t really cheaper, better, or faster than human engineering right now. Does that mean in two years it will be? Possibly. There will be a lot of optimizations in the message traffic, token usage, foundation models, and also just the Moore’s law of the hardware and energy costs.

But really it’s the sophistication of the agent systems that controls quality more than anything. Simply following waterfall (I know, right? Yuck… but it worked) increased code quality tremendously. I also gave it the SelfDocumentingCode pattern language that I wrote (on WikiWikiWeb) as a code review agent and quality improved tremendously again.
theshrike79 2 days ago
> Based on my setup as of today, I’d imagine by sometime next year that will be normal and then the conversation will be very different; mostly around cost control. I wouldn’t be surprised if there is a breakout popular agent control flow language by next year as well.

Currently it's just VC-funded. The $20 packages they're selling are in no way cost-effective (for them).

That's why I'm driving all available models like I stole them, building every tool I can think of before they start charging actual money again.

By then local models will most likely be at a "good enough" level, especially when combined with MCPs and tool use, so I don't need to pay per token for APIs except for special cases.
zarzavat 2 days ago
> If you have a sophisticated agent system that uses multiple forward and backward passes, the quality improves tremendously.

Just an hour ago I asked Claude to find bugs in a function and it found 1 real bug and 6 hallucinated bugs. One of the "bugs" it wanted to "fix" was to revert a change that I had made previously to fix a bug in code it had written.

I just don't understand how people burning tokens on sophisticated multi-agent systems are getting any value from that. These LLMs don't know when they are doing something wrong, and throwing more money at the problem won't make them any smarter. It's like trying to build Einstein by hiring more and more schoolkids.

Don't get me wrong, Claude is a fantastic productivity boost, but letting it run around unsupervised would slow me down rather than speed me up.
oblio a day ago
> and also just the Moore’s law of the hardware and energy costs.

What Moore's law?