Remix.run Logo
GardenLetter27 4 hours ago

Could the harness not check for a failed tool call and pass it to a small model for correction without clogging up the main context?

lambda 3 hours ago | parent | next [-]

The thing is, to do a proper fix it would really need all of the context (maybe the tool call that failed was for an edit to a file that was last touched way at the beginning of the context), so you'd need to either keep that smaller model running doing prompt processing all the time, or have a very long wait while it does prompt processing on your whole session.

And then also, sometimes the tool call errors are because of something like a file was changed out from under it; the larger model is probably going to do a better job of figuring that out and fixing it up.

Finally, in Pi, you can always just use the /tree command to skip back to before a series of failed tool calls, with a summary if you want to let the model know what happened. The Pi /tree command is pretty powerful in managing your context

everforward an hour ago | parent [-]

An illustrative example I've seen a lot is creating Jira tickets in projects with custom fields marked as mandatory. It tries to create the ticket without the field and the tool call fails. The LLM needs access to the full context so that it can generate text to put in the "Why couldn't this meeting be an email?" field.

Greenpants 4 hours ago | parent | prev [-]

I'm actually quite sure that directly retrying the tool call would often fix the edit-call already. But these models have been trained to "think" for a while for any problem solving, so they'll presume the problem of the edit is more fundamental and spend unnecessary tokens filling up the context.

I'll experiment more with the effectiveness of AGENTS.md rules for local Pi agents. I feel like smaller (local) LLMs just lack in attentiveness to elements in the context window, like precise instructions, compared to e.g. Claude models.