▲ sally_glance · 3 hours ago
Great job and congrats! Working on my own harness has been one of my favorite side projects over the past couple of weeks, though of course I never finish anything... But I'm very interested in your experience with the following:

1. Context management - specifically pruning old tool-call responses, truncating tool output, and automatic compaction. These have worked pretty well for me; the benefits of reducing context seem to greatly outweigh the gains from "remembering" everything. I always leave short summaries behind, though.

2. "Subagents" - my latest attempts revolve around not exposing any tools to the main agent at all, except for a run_agent tool; the subagent then has access to the classic search/execute/fetch tools. My theory is that if subagents return concise summaries, this automatically keeps the parent agent's context clean for much longer. Still experimenting, though - writing prompts for subagents may also be too far outside the current training sets.
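To make the subagent idea concrete, here's a minimal sketch of the pattern described above: the main agent sees only a run_agent tool, while the subagent gets the real tools and is instructed to come back with a short summary. All names (`llm_call`, `dispatch_tool`, the tool list) are hypothetical stand-ins for whatever your harness actually uses, not any particular API.

```python
# Hypothetical sketch: the parent calls run_agent(); only the subagent
# loop ever sees real tools, and only a short summary flows back up.
SUBAGENT_TOOLS = ["search", "execute", "fetch"]  # tools hidden from the parent

def run_agent(task, llm_call, dispatch_tool, max_steps=10):
    """Drive a subagent on `task`; return its concise summary, never raw tool output.

    `llm_call(messages, tools)` -> {"tool_call": ... or None, "content": str}
    `dispatch_tool(tool_call)` -> str   (both are assumed harness hooks)
    """
    messages = [
        {"role": "system", "content":
         "Complete the task using your tools, then reply with a summary "
         "of at most 3 sentences."},
        {"role": "user", "content": task},
    ]
    for _ in range(max_steps):
        reply = llm_call(messages, tools=SUBAGENT_TOOLS)
        if reply.get("tool_call") is None:
            return reply["content"]  # final summary is all the parent sees
        # Tool results stay inside the subagent's own message list.
        messages.append({"role": "assistant", "content": reply.get("content", "")})
        messages.append({"role": "tool", "content": dispatch_tool(reply["tool_call"])})
    return "Subagent hit the step limit without finishing."
```

The key property is that tool results accumulate in the subagent's local `messages` list and are discarded when it returns, so the parent's context only ever grows by one summary per delegation.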
▲ GodelNumbering · 2 hours ago | parent
Thanks.

1. Context management - don't bother with pruning unless your API doesn't support caching. Every prune breaks the cache prefix, and you lose the ~90% discount on cached tokens.

2. I did some work improving Cline's subagent feature, which Dirac inherited. In my experience, not all models are trained to delegate work effectively, so YMMV. A common pitfall to watch for: what happens if one or more subagents get stuck in a loop or, for whatever reason, never return? You need a mechanism in the main agent to control them.
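One way to get the control mechanism mentioned in point 2 is to have the parent wrap every subagent run in a timeout, so a looping or hung subagent is cut off and replaced with a short failure note. This is a generic sketch using Python's asyncio, not Cline's or Dirac's actual implementation; `run_subagent` is a hypothetical stand-in for whatever coroutine drives the subagent.

```python
import asyncio

async def supervised_run(run_subagent, task, timeout_s=120.0):
    """Run one subagent with a hard deadline; never let it stall the parent."""
    try:
        return await asyncio.wait_for(run_subagent(task), timeout=timeout_s)
    except asyncio.TimeoutError:
        # Return a short note instead of partial output, so the parent's
        # context stays clean even when a subagent fails.
        return f"[subagent timed out after {timeout_s:.0f}s on: {task!r}]"

async def fan_out(run_subagent, tasks):
    # Run several subagents concurrently; each is individually bounded,
    # so one stuck subagent can't block its siblings.
    return await asyncio.gather(*(supervised_run(run_subagent, t) for t in tasks))
```

`asyncio.wait_for` cancels the subagent task on timeout, which also gives you a natural place to release any resources (sandboxes, browser sessions) the subagent was holding.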