| ▲ | gertjandewilde 6 hours ago | |
We built a unified API with a large surface area and ran into a problem when building our MCP server: tool definitions alone burned 50,000+ tokens before the agent touched a single user message. The fix that worked for us was giving agents a CLI instead. ~80 tokens in the system prompt, progressive discovery through --help, and permission enforcement baked into the binary rather than prompts. The post covers the benchmarks (Scalekit's 75-run comparison showed 4-32x token overhead for MCP vs CLI), the architecture, and an honest section on where CLIs fall short (streaming, delegated auth, distribution). | ||