mappu 6 hours ago

In my harness I implemented apply_patch by simply feeding unified diffs to patch -p1. I was shocked to see how bad models are at generating them, so I started logging diff failures to analyse them:

- All models are terrible at generating line numbers for a proper diff; give up on expecting correct ones

- Some models (Owl-alpha) must have been post-trained on Codex transcripts, because they occasionally push its V4A patch format into any diff tool available

- Codex puts a lot of info in its system prompt about the desired patch style, making larger hunks instead of granular ones, etc
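To illustrate the first point, here is a minimal sketch of applying a hunk while ignoring its line numbers entirely and locating it by its context and removed lines instead. This is not what patch -p1 does and not necessarily what any particular harness does. The function name apply_hunk is made up for the example, and it assumes a single hunk with the ---/+++ file headers already stripped.

```python
def apply_hunk(source: str, hunk: str) -> str:
    """Apply one unified-diff hunk by content match, ignoring @@ line numbers.

    Hypothetical helper: expects a single hunk without ---/+++ file headers.
    """
    old, new = [], []
    for line in hunk.splitlines():
        if line.startswith("@@"):
            continue  # the model's line numbers are deliberately not trusted
        if line.startswith("-"):
            old.append(line[1:])
        elif line.startswith("+"):
            new.append(line[1:])
        else:
            # Context line; tolerate a missing leading space.
            text = line[1:] if line.startswith(" ") else line
            old.append(text)
            new.append(text)

    if not old:
        raise ValueError("hunk has no context or removed lines to anchor on")

    lines = source.splitlines()
    matches = [
        i
        for i in range(len(lines) - len(old) + 1)
        if lines[i : i + len(old)] == old
    ]
    # Refuse to guess when the anchor text is missing or ambiguous.
    if len(matches) != 1:
        raise ValueError(f"hunk matched {len(matches)} locations, expected 1")

    start = matches[0]
    lines[start : start + len(old)] = new
    trailing = "\n" if source.endswith("\n") else ""
    return "\n".join(lines) + trailing
```

Rejecting ambiguous matches also hints at why larger hunks help: more context lines make a unique match more likely.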

fractorial 5 hours ago | parent [-]

In my harness, I implemented tool_edit as a subset of Rob Pike’s Sam editor syntax [0].

It only needs ~650 tokens of system prompt to work. It’s pretty stellar.

[0] https://9p.io/sys/doc/sam/sam.html
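For anyone unfamiliar with the idea, a rough sketch of what a tiny Sam-style edit tool could look like follows. The subset here is my own guess for illustration, not the one described above: only the form /regexp/ followed by a, i, c or d is supported, and the address resolves to the first match from the start of the buffer. The name sam_edit is hypothetical, and it uses Python's re syntax, which differs from Sam's regular expressions.

```python
import re

# /regexp/ then one of a, i, c, d, optionally followed by /text/
CMD = re.compile(
    r"^/((?:\\.|[^/\\])*)/([acid])(?:/((?:\\.|[^/\\])*)/)?$", re.S
)


def sam_edit(buf: str, command: str) -> str:
    """Apply one command from a hypothetical Sam-like subset to buf."""
    m = CMD.match(command)
    if not m:
        raise ValueError(f"unsupported command: {command!r}")
    pattern, op, text = m.groups()

    if op == "d":
        if text is not None:
            raise ValueError("d takes no text")
        text = ""
    elif text is None:
        raise ValueError(f"{op} requires /text/")

    # Simplified escapes: \n is a newline, \/ is a literal slash.
    text = text.replace("\\n", "\n").replace("\\/", "/")

    hit = re.search(pattern, buf)
    if hit is None:
        raise ValueError("address did not match")
    start, end = hit.span()

    if op == "a":  # append after the match
        return buf[:end] + text + buf[end:]
    if op == "i":  # insert before the match
        return buf[:start] + text + buf[start:]
    return buf[:start] + text + buf[end:]  # c (change) and d (delete)
```

For example, sam_edit("x = 1\ny = 2\n", "/y = 2/c/y = 3/") changes the second line. The edit is addressed by content rather than by line number.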