HarHarVeryFunny an hour ago

I wonder if this could be usefully mitigated with a combination of prompt (prefix) caching and an agent that let you control what the prompt prefix consisted of. The goal would be to incur that slow prefill once to build the prompt cache, then have subsequent prompts consist of mostly this fixed prefix plus specific instructions.

For a language like C++ where modules are split into definition (.h) and implementation (.cpp) parts, one choice of prefix would be all the header files for the project (which aren't likely to change much).
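A rough sketch of what building such a header-only prefix might look like. The function name, the `// FILE:` marker and the set of suffixes are all made up for illustration; the one thing that matters is that the files are concatenated in a deterministic order, since a prefix cache can only hit if the text is byte-for-byte identical between requests.

```python
from pathlib import Path

# Hypothetical choice of what counts as a "header"; adjust per project.
HEADER_SUFFIXES = {".h", ".hpp"}


def build_header_prefix(project_root: str) -> str:
    """Concatenate all header files under project_root into one prompt prefix."""
    root = Path(project_root)
    # Sort so the prefix is identical from run to run (needed for cache reuse).
    headers = sorted(
        p for p in root.rglob("*") if p.is_file() and p.suffix in HEADER_SUFFIXES
    )
    parts = []
    for path in headers:
        rel = path.relative_to(root).as_posix()
        text = path.read_text(encoding="utf-8", errors="replace")
        # The "// FILE:" marker is an assumed convention, not a standard.
        parts.append(f"// FILE: {rel}\n{text}\n")
    return "".join(parts)
```

Whether this fits in the context window for a real project is a separate question; for a large codebase you would presumably have to pick a subset of headers.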

More generally, the idea would be to have an agent that has cached-prefix reuse as its primary context management goal.

Another possibility, to support caching of files that have since changed, would be for the agent to build the context as a fixed prefix reflecting some or all of the codebase in its start-of-session state, then append any changes to that, with appropriate prompting to only use the latest definition of a function.

e.g.

Say file A initially contains functions X, Y and Z, so the prompt prefix is built to include X Y Z. If the user then modifies Y -> Y', just append Y' to the context instead of rewriting Y in place. The cached prefix is unchanged, giving X Y Z Y'.
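In code, the append-only idea might look something like the sketch below. The class, method names and marker text are hypothetical and not taken from any existing agent; the point is only that edits are appended rather than applied in place, so the context after an edit always starts with the context from before it.

```python
from dataclasses import dataclass, field

# Assumed wording for the "use the latest definition" instruction.
LATEST_WINS_RULE = (
    "If a function appears more than once above, "
    "only the last definition is current.\n"
)


@dataclass
class AppendOnlyContext:
    """Fixed start-of-session prefix plus an append-only log of changes."""

    prefix: str
    updates: list[str] = field(default_factory=list)

    def add_update(self, name: str, new_source: str) -> None:
        # Never touch self.prefix: editing it would invalidate the cache.
        self.updates.append(f"# UPDATED: {name}\n{new_source}\n")

    def context(self) -> str:
        # The cacheable part: the original prefix followed by all changes so far.
        return self.prefix + "".join(self.updates)

    def render(self, instruction: str) -> str:
        # The per-request instruction goes last so it never disturbs the prefix.
        return self.context() + LATEST_WINS_RULE + instruction


ctx = AppendOnlyContext(prefix="def X(): ...\ndef Y(): ...\ndef Z(): ...\n")
before = ctx.context()
ctx.add_update("Y", "def Y(): return 1")
after = ctx.context()
# The old context is a strict prefix of the new one: X Y Z -> X Y Z Y'
print(after.startswith(before))
```

The obvious cost is that the context only ever grows, so at some point the agent would presumably need to rebuild the prefix from the current state and pay for the prefill again.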