honeycrispy 3 days ago
A couple weeks ago I had Opus 4.5 go over my project and improve anything it could find. It "worked", but the architecture decisions it made were baffling, and the code had many, many bugs. I had to rewrite half of it. I'm not an AI hater; I love AI for tests, finding bugs, and small chores. Opus is great for specific, targeted tasks. But don't ask it to do any general architecture, because you'll soon regret it.
tda 3 days ago
Instead, you should prompt it to come up with suggestions, look for inconsistencies, etc. Then you get a list, and you pick the ones you find promising. Then you ask Claude to explain the what, why, and how of each idea. Only then do you let it implement something.
thousand_nights 3 days ago
These models work best when you know what you want to achieve and they help you get there while you guide them. "Improve anything you can find" sounds like you didn't really know what you wanted.
oncallthrow 3 days ago
In my experience these models (including Opus) aren’t very good at “improving” existing code. I’m not exactly sure why, because the code they produce themselves is generally excellent.
sothatsit 3 days ago
I like these examples that predictably show the weaknesses of current models. This reminds me of that example where someone asked an agent to improve a codebase in a loop overnight and they woke up to 100,000 lines of garbage [0]. Similarly, you see people doing side-by-side comparisons of their own implementation and what an AI produced, which can also show quite effectively how poor an AI's architecture decisions can be. This is why I think “plan modes” and spec-driven development are so important and effective for agents: they help avoid one of their main weaknesses.
enraged_camel 3 days ago
>> A couple weeks ago I had Opus 4.5 go over my project and improve anything it could find. It "worked" but the architecture decisions it made were baffling, and had many, many bugs.
So you gave it a poorly defined task, and it failed?
vbezhenar 3 days ago
I'm using AI tools to find issues in my code. 9/10 of their suggestions are utter nonsense, and applying them would make my code worse. That said, there are real issues they're finding, so it's worth it. I wouldn't be surprised if they kept finding issues indefinitely when run in a loop with fixes.
rleigh 3 days ago
I've found it to be terrible when you allow it to be creative. Constrain it, and it does much better. Have you tried the planning mode? Ask it to review the codebase and identify defects, but don't let it make any changes until you've discussed each one, or each category, and planned out how to correct them. I've had it refactor code perfectly, but only when it was given examples of exactly what I wanted, or clear direction on what to do (or not to do).