rox_kd 19 hours ago
In what settings do you mean? There are multiple strategies, and building your own compaction layer in front seems a bit overkill. Have you considered a cache strategy, or otherwise a summary pipeline? I once built an agent that, based on the messages, routed things to a smaller model for compaction/summaries to bring down the context for the main agent. But also make sure you start fresh context threads instead of banging through a single one until your whole feature is done. Working in small atomic increments works pretty well.
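The routing idea above can be sketched roughly like this. This is a minimal illustration, not anyone's actual implementation: `summarize` is a stand-in for a call to a smaller/cheaper model, and the names (`Thread`, `compact`, the thresholds) are made up for the example.

```python
from dataclasses import dataclass, field

@dataclass
class Thread:
    messages: list = field(default_factory=list)

def summarize(messages):
    # Stand-in for a call to a smaller/cheaper model; here we just
    # join the first word of each message as a crude digest.
    return "summary: " + "; ".join(m.split()[0] for m in messages)

def compact(thread, keep_recent=4, max_messages=8):
    """Once the thread grows past max_messages, route the older
    messages to the summarizer and keep the last keep_recent verbatim,
    so the main agent sees a short summary plus recent context."""
    if len(thread.messages) <= max_messages:
        return thread
    old = thread.messages[:-keep_recent]
    recent = thread.messages[-keep_recent:]
    return Thread(messages=[summarize(old)] + recent)
```

Starting a "fresh context thread" per atomic increment then just means constructing a new `Thread` per sub-task instead of carrying one forward.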
bhaviav100 19 hours ago | parent
Yes, compaction and smaller models help with cost per step. But my issue wasn't just inefficiency, it was agents retrying when they shouldn't. I needed visibility and limits per agent/task, and the ability to cut an agent off, not just optimize it.
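The "limits per agent/task, with a hard cutoff" part could look something like the sketch below. All names (`AgentBudget`, `BudgetExceeded`, the limit values) are hypothetical, just to show the shape of a guard that stops an agent instead of letting it keep retrying:

```python
class BudgetExceeded(Exception):
    """Raised to hard-stop an agent rather than let it keep going."""

class AgentBudget:
    """Per-agent/task limits: cap total steps and consecutive
    failed retries; counters double as visibility into spend."""
    def __init__(self, max_steps=20, max_retries=3):
        self.max_steps = max_steps
        self.max_retries = max_retries
        self.steps = 0
        self.retries = 0

    def record(self, success):
        # Call once per agent step; a success resets the retry streak.
        self.steps += 1
        self.retries = 0 if success else self.retries + 1
        if self.steps > self.max_steps:
            raise BudgetExceeded(f"step limit {self.max_steps} hit")
        if self.retries > self.max_retries:
            raise BudgetExceeded(f"retry limit {self.max_retries} hit")
```

The agent loop wraps each step in `record(...)` and catches `BudgetExceeded` as the cutoff signal, so the limit is enforced outside the model rather than hoped for inside it.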