| ▲ | Jgrubb 2 hours ago | |
The tokens are still being burnt; they're just being burnt in a parallel dimension, outside the user's main context window. | ||
| ▲ | ajmurmann an hour ago | parent | next [-] | |
It's true that the initial tool response still costs the same number of tokens, but it doesn't keep getting dragged along in the longer-lived top-level context. | ||
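A rough back-of-the-envelope sketch of that point, assuming every later turn resends the whole top-level context as input and ignoring prompt caching and output tokens. The function names and the numbers are hypothetical, picked only for illustration:

```python
def carried_input_tokens(payload_tokens: int, later_turns: int) -> int:
    """Input tokens spent re-reading a payload that stays in context
    for `later_turns` further turns (assumption: full resend each turn)."""
    return payload_tokens * later_turns


def inline_cost(tool_response_tokens: int, later_turns: int) -> int:
    """Tool response lands in the main context: read once on arrival,
    then re-read on every later turn."""
    return tool_response_tokens + carried_input_tokens(tool_response_tokens, later_turns)


def subagent_cost(tool_response_tokens: int, summary_tokens: int, later_turns: int) -> int:
    """Sub-agent reads the full tool response once; only a short summary
    enters the main context and rides along on later turns."""
    return (
        tool_response_tokens
        + summary_tokens
        + carried_input_tokens(summary_tokens, later_turns)
    )


# Hypothetical numbers: a 10,000-token tool response, a 200-token summary,
# and 20 more turns in the top-level conversation.
inline_total = inline_cost(10_000, 20)            # 210000
subagent_total = subagent_cost(10_000, 200, 20)   # 14200
```

Both paths pay the 10,000 tokens once, which matches the parent comment. The difference comes entirely from what gets carried along afterwards.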
| ▲ | ViewTrick1002 2 hours ago | parent | prev [-] | |
The real benefit is being able to use a cheaper, but good enough, model with a specific system prompt dedicated to that task. | ||