2001zhaozhao · 3 hours ago
Are you hosting your own infrastructure for coding agents? At first glance, sharing actual codebase context across compactions / multiple tasks seems hard to pull off with a good cost-benefit ratio unless you have vertical integration from inference all the way up to the coding agent harness. External LLM providers like OpenAI tend to charge quite a bit for longer-term caching, and on top of that you pay the 0.1x cache-read cost on every LLM call. Since you won't actually need all of the repeated context every time, sharing it effectively pads every agentic task with a longer context, which can increase API costs by more than the caching saves.
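Rough numbers to illustrate (a minimal sketch with made-up pricing and token counts, not any provider's actual rates): even at a 0.1x read rate, dragging a large shared context along on every call can cost more than not carrying it at all, despite being much cheaper than resending it uncached.

    # Back-of-envelope cost model. All figures are assumptions for illustration.
    INPUT_PER_MTOK = 3.00      # $/M input tokens (assumed)
    CACHE_WRITE_MULT = 1.25    # cache-write premium (assumed)
    CACHE_READ_MULT = 0.10     # the 0.1x cache-read rate from the comment

    task_tokens = 20_000       # per-call context the task actually needs (assumed)
    shared_tokens = 150_000    # shared codebase context (assumed)
    n_calls = 40               # LLM calls in one agentic task (assumed)

    def cost(tokens, mult=1.0):
        return tokens / 1e6 * INPUT_PER_MTOK * mult

    # a) no shared context: shorter prompt on every call
    no_share = n_calls * cost(task_tokens)
    # b) shared context resent uncached on every call
    uncached = n_calls * cost(task_tokens + shared_tokens)
    # c) shared context cached: one write, then 0.1x reads on every call
    cached = (cost(shared_tokens, CACHE_WRITE_MULT)
              + n_calls * (cost(shared_tokens, CACHE_READ_MULT)
                           + cost(task_tokens)))

    print(f"no shared context: ${no_share:.2f}")   # $2.40
    print(f"shared, uncached:  ${uncached:.2f}")   # $20.40
    print(f"shared, cached:    ${cached:.2f}")     # $4.76

With these numbers, caching cuts the shared-context cost by ~4x versus resending it, but the task still costs roughly double what it would without the shared context at all.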