▲ arthurjj 7 hours ago
LLMs not being lazy enough definitely feels true. But it's unclear to me if it is a permanent issue, one that will be fixed in the next model upgrade, or just one your agent framework / CI/CD setup takes care of. E.g., right now when using agents, after I'm "done" with the feature and I commit, I usually prompt "Check for any bugs or refactorings we should do." I could see a CI/CD step that says "Look at the last N commits and check if the code in them could be simplified or refactored to have a better abstraction."
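A minimal sketch of what such a CI/CD step might look like. The prompt wording follows the comment above; `last_n_commit_diff` and `build_review_prompt` are hypothetical helper names, and the actual agent invocation (not shown) would depend on whichever agent CLI or API you use:

```python
# Sketch of a post-merge review step: gather the combined diff of the
# last N commits and build a refactoring-review prompt for an agent.
import subprocess

def last_n_commit_diff(n: int = 5) -> str:
    """Return the combined diff of the last n commits (requires a git repo)."""
    return subprocess.run(
        ["git", "diff", f"HEAD~{n}", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout

def build_review_prompt(diff: str, n: int) -> str:
    """Wrap the diff in the review instruction from the comment above."""
    return (
        f"Look at the changes from the last {n} commits below and check "
        "whether the code in them could be simplified or refactored to "
        "have a better abstraction. Propose concrete changes only where "
        "they reduce overall complexity.\n\n" + diff
    )
```

The resulting prompt string would then be handed to whatever agent runs in your pipeline.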
▲ ocrow 2 hours ago
I've tried this approach of instructing the LLM to look for opportunities to abstract, but it's not good at finding the commonalities after the fact, once possibly related functions have already diverged unnecessarily. It writes "sloppy" code, that is to say code that is locally correct but which fails to build toward overall generalizations. That sloppy code is a cul-de-sac: easy to write, but adding to the messiness, and really tough to improve.

When a good programmer writes a new feature, they look for both existing and new abstractions that can be applied. They consult their mental model of the whole system and examine whether it can be leveraged or needs to be updated. That's how they avoid compounding complications. To take a big-picture view like that, the LLM needs the right context: it would need to know what its system model is and decide when to update that model. For now, just telling it what to write isn't enough to get good code. You have to tell it what to pay attention to.
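A toy illustration (my own, not from the thread) of the kind of divergence being described: two copy-pasted functions drift apart, and the generalization they were both building toward only emerges if someone is looking for it:

```python
# Two functions that diverged after copy-paste. Each is locally correct,
# but together they hide a shared abstraction.
def total_order_price(items):
    total = 0
    for price, qty in items:
        total += price * qty
    return total

def total_cart_weight(items):
    total = 0
    for weight, qty in items:
        total += weight * qty
    return total

# The generalization both were building toward: a weighted sum over
# (value, quantity) pairs.
def weighted_total(items):
    return sum(value * qty for value, qty in items)
```

Spotting this requires noticing the commonality across call sites, which is exactly the whole-system view the comment says the LLM lacks without the right context.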
▲ layer8 6 hours ago
It’s difficult to define a termination criterion for that. When you ask LLMs to find any X, they usually find something they claim qualifies as X.
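One common mitigation is to give the model an explicit no-finding option and treat it as a valid terminal answer, so "always finds something" at least has an exit. A minimal sketch, assuming the prompt instructs the model to reply with the literal sentinel `NONE` when nothing qualifies (the sentinel and function name are my own):

```python
def parse_findings(reply: str) -> list[str]:
    """Treat an explicit 'NONE' reply as a valid empty result,
    instead of forcing the model to manufacture findings."""
    reply = reply.strip()
    if reply.upper() == "NONE":
        return []
    # Otherwise, one finding per non-empty line.
    return [line.strip() for line in reply.splitlines() if line.strip()]
```

This doesn't make the criterion objective, but it removes the framing where any answer other than a finding is unavailable.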
▲ JeremyNT 6 hours ago
I agree, it's not a fundamental characteristic but a limitation of how the tool is being used. If you just tell these things to add, they'll absolutely do that indiscriminately, and you end up with huge piles of slop. But if I tell an LLM-backed harness to reduce LOC and DRY things up during the review phase, it will do that too. I think you're more likely to get the huge piles if you delegate a large task and don't review it (either yourself or with an agent).
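The "reduce LOC" instruction also suggests one crude but objective acceptance check for such a review pass: only take a proposed rewrite if it measurably shrinks the code. A sketch under that assumption (function name and threshold are my own; raw line count is obviously a blunt proxy for complexity):

```python
def accept_refactor(before: str, after: str, min_reduction: int = 1) -> bool:
    """Accept a proposed rewrite only if it removes at least
    min_reduction lines relative to the original."""
    removed = len(before.splitlines()) - len(after.splitlines())
    return removed >= min_reduction
```

Gating the agent's output on a check like this is one way to keep a "simplify" pass from churning code without actually simplifying it.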