MCP Server Is Eating Your Context Window. There's a Simpler Way

2 months ago

MCP servers can consume a large share of the context window: tool definitions alone take up to 72% of Claude's 200k-token limit.
Three approaches address context bloat: MCP with compression tricks, code execution (the Duet approach), and the CLI as the agent interface.
The CLI approach offers progressive disclosure: the agent loads only the information it needs, on demand, at a cost of ~80 tokens upfront vs. 10,000+ for MCP.
CLI agents are more reliable because they execute locally, avoiding the remote server failures seen with MCP (28% failure rate).
Structural safety in a CLI enforces permissions at the binary level, whereas MCP relies on prompt-based safety.
CLI offers universal compatibility with minimal setup, while MCP requires dedicated client support and connection management.
CLI is not ideal for high-frequency tools, complex workflows, or scenarios requiring OAuth and user consent flows.
API providers should consider progressive disclosure, structural safety, and machine-friendly output formats for AI agents.
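
The three recommendations above (progressive disclosure, structural safety, machine-friendly output) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the command names, the help text, the `read_only` flag, and the JSON envelope shape are hypothetical and do not come from the article or from any real CLI.

```python
import json

# Hypothetical command registry; names and help text are illustrative only.
COMMANDS = {
    "list": {
        "summary": "List records",
        "detail": "Usage: tool list [--limit N]. Returns records as JSON.",
        "mutating": False,
    },
    "delete": {
        "summary": "Delete a record",
        "detail": "Usage: tool delete <id>. Permanently removes the record.",
        "mutating": True,
    },
}


def overview() -> str:
    """Progressive disclosure, step 1: one short line per command."""
    return "\n".join(f"{name}: {spec['summary']}" for name, spec in COMMANDS.items())


def help_for(name: str) -> str:
    """Progressive disclosure, step 2: full help, fetched only on request."""
    return COMMANDS[name]["detail"]


def run(name: str, read_only: bool = True) -> str:
    """Dispatch a command and return a JSON envelope the agent can parse.

    Structural safety: the permission check lives in the program itself,
    so a mutating command is refused in read-only mode no matter what
    the prompt says.
    """
    spec = COMMANDS.get(name)
    if spec is None:
        return json.dumps({"ok": False, "error": f"unknown command: {name}"})
    if spec["mutating"] and read_only:
        return json.dumps({"ok": False, "error": f"{name} is blocked in read-only mode"})
    # A real tool would do the work here; the sketch only echoes the command.
    return json.dumps({"ok": True, "command": name})


if __name__ == "__main__":
    print(overview())          # short text the agent sees upfront
    print(help_for("delete"))  # detail loaded only when needed
    print(run("delete"))       # refused: read-only is the default
```

In this sketch the agent initially sees only the output of `overview()` and calls `help_for()` for a single command when it needs the details, so the full documentation never has to sit in context at once. Defaulting `read_only` to `True` is a design choice of the sketch: destructive operations have to be enabled explicitly by whoever launches the tool.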
