Remix.run Logo
jstanley 7 hours ago

Conversely, I often find coding agents privileging the existing code when they could do a much better job if they changed it to suit the new requirement.

I guess it comes down to how ossified you want your existing code to be.

If it's a big production application that's been running for decades then you probably want the minimum possible change.

If you're just experimenting with stuff and the project didn't exist at all 3 days ago then you want the agent to make it better rather than leave it alone.

Probably they just need to learn to calibrate themselves better to the project context.

_pastel 6 hours ago | parent | next [-]

The tradeoff is highly contextual; it's not a tradeoff an agent can always make by inspecting the project themselves.

Even within the same project, for a given PR, there are some parts of the codebase I want to modify freely and some that I want fixed to reduce the diff and testing scope.

I try to explain up-front to the agent how aggressively they can modify the existing code and which parts, but I've had mixed success; usually they bias towards a minimal diff even if that means duplication or abusing some abstractions. If anyone has had better success, I'd love to hear your approach.

mncharity 5 hours ago | parent [-]

Just brainstorming, but perhaps a more tangible gradient, with social backpressure? Imagine three identical patch tools: "patch", "submit patch", and "send patch to chief architect and wait". With the "where each can be used" explained or even enforced. Having the contrast of less-aggressive options, might make it easier to encourage more aggressive refactoring elsewhere. Or pushing the impact further up the CoT, "patch'ing X requires an analysis field describing less invasive alternatives and their un/suitability; for Y, just do it, refactor aggressively".

jauntywundrkind 3 hours ago | parent | prev | next [-]

To get the agent to think for itself sometimes it feels like I have to delete a bunch of code and markdown first. Instruction to refactor/reconsider broadly has such mild success, I find.

I'll literally run an agent & tell it to clean up a markdown file that has too much design in it, delete the technical material, and/or delete key implementations/interfaces in the source, then tell a new session to do the work, come up with the design. (Then undelete and reconcile with less naive sessions.)

Path dependence is so strong. Right now I do this flow manually but I would very much like to codify this, make a skill for this pattern that serves so well.

jdkoeck 2 hours ago | parent | prev [-]

Tell me you’re using Codex without telling me you’re using Codex :)