NitpickLawyer 3 hours ago
> I truly don't understand how this is a reproduction if you literally point to look for bugs within certain lines within a certain file. Disingenuous.

You missed this part:

> For transparency, the Focus on lines ... instructions in our detection prompts were not line ranges we chose manually after inspecting the code. They were outputs of a prior agent step. We used a two-step workflow for these file-level reviews:
>
> Planning step. We ran the same model under test with a planning prompt along the lines of "Plan how to find issues in the file, split it into chunks." The output of that step was a chunking plan for the target file.
>
> Detection step. For each chunk proposed by the planning step, we spawned a separate detection agent. That agent received instructions like Focus on lines ... for its assigned range and then investigated that slice while still being able to inspect other repository files to confirm or refute behavior.
>
> That means the line ranges shown in the prompt excerpts were downstream artifacts of the agent's own planning step, not hand-picked slices chosen by us. We want to be explicit about that because the chunking strategy shapes what each detection agent sees, and we do not want to present the workflow as more manually curated than it was.
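For anyone who wants the shape of that two-step workflow in code, here is a rough sketch, not the article's actual harness. Everything named here is hypothetical: `plan_chunks` is a fixed-size splitter standing in for the model-driven planning step, the prompt wording in `build_detection_prompt` is made up, and `detect` is whatever callable wraps your detection agent.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Chunk:
    start: int  # 1-indexed, inclusive
    end: int    # 1-indexed, inclusive


def plan_chunks(num_lines: int, chunk_size: int) -> List[Chunk]:
    """Stand-in for the planning step (assumption: fixed-size split).

    In the quoted workflow the model under test proposes the chunks;
    a deterministic splitter is used here only so the sketch runs.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        Chunk(start, min(start + chunk_size - 1, num_lines))
        for start in range(1, num_lines + 1, chunk_size)
    ]


def build_detection_prompt(path: str, chunk: Chunk) -> str:
    # Hypothetical wording; the real prompts are elided in the article.
    return f"Review {path}. Focus on lines {chunk.start}-{chunk.end}."


def run_review(
    path: str,
    num_lines: int,
    chunk_size: int,
    detect: Callable[[str], List[str]],
) -> List[str]:
    """Detection step: one detect() call per planned chunk."""
    findings: List[str] = []
    for chunk in plan_chunks(num_lines, chunk_size):
        findings.extend(detect(build_detection_prompt(path, chunk)))
    return findings


if __name__ == "__main__":
    # Dummy detector that reports nothing; swap in a real agent call.
    print(run_review("example.c", 10, 4, lambda prompt: []))
```

The point the quote is making maps onto the first function: the line ranges come out of the planning step, so nobody hand-picks them after reading the code.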
volkk 3 hours ago
okay i did miss that part-- makes it definitely more interesting and i need to read articles with less haste