| ▲ | 6thbit 2 hours ago | |
Is this understanding correct: the LLM uses harness tools to ask for permission, then interprets the answer and proceeds? If so, this can't live 100% in the harness.

First, you would need the harness to decide when the model should ask for permission, which is more of an LLM-y thing to do. The harness can block command execution, but it wouldn't prevent the case where the model goes off and starts reading files, or just burns tokens and spawns subagents, none of which harnesses typically prevent.

Second, for the harness to know the LLM is actually following the answer, it would need to interpret both the answer and the LLM's actions, which is also an LLM-y thing to do. Granted, on this one the harness could offer an explicit yes/no. I like Codex's implementation in plan mode, where you select from pre-built answers but can still Tab to add notes. But even that doesn't guarantee the model will honor an explicit No, just as in the OP's case.

I agree with your hunch, though: there may be ways to make this work at the harness level. I only suspect it's less trivial than it seems. Would be great to hear people's ideas on this.
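A minimal sketch of the "explicit yes/no at the harness layer" idea above. Everything here (the `ApprovalGate` class, tool names, return strings) is hypothetical illustration, not any real harness's API. The point is that the harness, not the model, holds the gate: a denied call is simply never dispatched, so there is nothing for the model to reinterpret or work around.

```python
# Hypothetical sketch: a harness-side approval gate. The model *proposes*
# a tool call; the harness decides whether it runs, so a "No" is enforced
# mechanically rather than trusted to the model's interpretation.
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    name: str
    args: dict

@dataclass
class ApprovalGate:
    # Tools the user has pre-approved; everything else needs an explicit yes.
    auto_approved: set = field(default_factory=set)

    def dispatch(self, call: ToolCall, ask_user) -> str:
        if call.name in self.auto_approved:
            return f"ran {call.name}"
        answer = ask_user(f"Allow {call.name}({call.args})? [y/N] ")
        if answer.strip().lower() != "y":
            # The harness drops the call entirely; the model only sees
            # a denial result, never a chance to "proceed anyway".
            return f"denied {call.name}"
        return f"ran {call.name}"

gate = ApprovalGate(auto_approved={"read_file"})
# Pre-approved tool runs without prompting:
print(gate.dispatch(ToolCall("read_file", {"path": "a.txt"}), lambda _: "n"))
# Anything else is blocked unless the user explicitly says yes:
print(gate.dispatch(ToolCall("exec", {"cmd": "rm -rf /tmp/x"}), lambda _: "n"))
```

This still doesn't solve the first problem in the comment (deciding *when* a permission question is warranted is itself LLM-y), but it does make the yes/no binding.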
| ▲ | angry_octet an hour ago |
The harness needs to intercept all tool calls and compare them against an authorization list. The problem is that this flow uses already-granted core permissions, so you'd need a tighter set of default scopes, which means approving a whole batch of tool calls at the harness layer rather than in chat. That is obviously more tedious. The answer might be another tool that analyzes the pending tool calls and presents a diagram or list of what would be fetched, sent, read, and written. But it gets very hard to truly observe what happens once you have a bunch of POST calls. So maybe it needs a kind of incremental approval, almost like a series of mini-PRs for each change.
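A sketch of the batch "mini-PR" approval described above, under assumed names (`ProposedCall`, `DEFAULT_SCOPES`, the scope labels) that are illustrative, not from any real harness: collect the model's proposed calls, summarize what would be read, written, or sent, and execute only what survives a default-deny scope check plus per-call approval.

```python
# Hypothetical sketch: batch ("mini-PR") approval. The harness collects the
# model's proposed tool calls, summarizes what they would touch, and only
# executes calls that are either in the default scopes or explicitly approved.
from dataclasses import dataclass

@dataclass
class ProposedCall:
    tool: str
    scope: str   # e.g. "read", "write", "network"
    target: str

DEFAULT_SCOPES = {"read"}  # tighter default: only reads are pre-authorized

def summarize(batch):
    """Show what the batch would touch, flagging out-of-scope calls."""
    lines = []
    for call in batch:
        flag = "" if call.scope in DEFAULT_SCOPES else "  <-- needs approval"
        lines.append(f"[{call.scope}] {call.tool} -> {call.target}{flag}")
    return "\n".join(lines)

def filter_approved(batch, approve):
    """Keep pre-authorized calls; ask per-call for anything outside scope."""
    return [c for c in batch if c.scope in DEFAULT_SCOPES or approve(c)]

batch = [
    ProposedCall("read_file", "read", "src/main.py"),
    ProposedCall("http_post", "network", "https://example.com/api"),
    ProposedCall("write_file", "write", "src/main.py"),
]
print(summarize(batch))
# Suppose the user approves writes but not network calls:
kept = filter_approved(batch, approve=lambda c: c.scope == "write")
print([c.tool for c in kept])
```

The summary step is where the POST-call problem bites: the harness can show *that* an `http_post` will happen, but not what the remote side will do with it, which is why incremental, per-change approval may be the fallback.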