Remix clone Hacker News

new | show | ask | jobs Github

	▲	algis-hn 6 hours ago
		Accurate for naive MCP client implementations, but a proxy layer with inference-time routing solves exactly this control problem. BM25 semantic matching on each incoming query exposes only 3-5 relevant tool schemas to the agent rather than loading everything upfront - the 44K token cold-start cost that the article cites mostly disappears because the routing layer is doing selection work. MCPProxy (https://github.com/smart-mcp-proxy/mcpproxy-go) implements this pattern: structured schemas stay for validation and security quarantine, but the agent only sees what's relevant per query rather than the full catalog. The tradeoff isn't MCP vs CLI - it's routing-aware MCP vs naive MCP, and the former competes with CLI on token efficiency while retaining the organizational benefits the article argues for.