user43928 9 hours ago
Won't any input be charged uncached, and the output of the small model charged again as uncached input to the bigger model? I don't know whether that comes out ahead compared to just staying with the better model in the first place.
mwigdahl 8 hours ago
It's a good question, but for multiturn conversations even cached context adds up quickly. My experience has been that spawning off subagents for defined tasks in a large overall plan generally makes me come out ahead. I'm sure folks' mileage will vary though.