| ▲ | kanodiaayush 3 hours ago | |
If we use prompt caching - isn't a largish MCP tools section just like a fixed token penalty in return for higher speed at runtime, because tools don't need to be discovered on demand, and that's the better tradeoff? At least for the most powerful models it doesn't feel like their quality goes down much with a few MCP servers. I might be missing something. | ||