I traced 3,177 API calls to see what 4 AI coding tools put in the context window
6 days ago
- #context window analysis
- #token efficiency
- #AI coding tools
- The author built Context Lens to analyze how different AI coding tools use tokens in their context windows.
- Four tools (Claude Opus, Claude Sonnet, Codex, Gemini) were tested with the same bug-fixing task in an Express.js repository.
- All tools successfully fixed the bug but used vastly different token counts: Opus (23K-35K), Sonnet (43K-70K), Codex (29K-47K), Gemini (179K-350K).
- Opus was the most efficient, using git history to pinpoint the bug with minimal code reading, but it carried heavy 'tool definition' overhead (69% of its context).
- Sonnet took a thorough approach, reading test files and source code, resulting in more balanced context usage but higher token counts.
- Codex used Unix-like commands (grep, sed) for targeted code reading, making it predictable and efficient with low tool definition overhead (6%).
- Gemini had no tool definition overhead but consumed context aggressively, dumping entire files and git histories (tool results made up 96% of its context), with highly variable token usage.
- None of the tools actively managed their context budget; efficiency differences came from investigation strategies rather than deliberate optimization.
- Context Lens is open-source and provides real-time analysis of LLM API calls, helping developers understand token usage.
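The kind of per-category breakdown described above (tool definitions vs. tool results vs. conversation) can be sketched by bucketing a captured API request payload. This is a minimal illustration, not Context Lens's actual implementation: the payload shape mirrors a typical chat-style request with a `system` field, a `tools` array, and `tool_result` content blocks, and it uses a rough chars/4 token estimate where a real tracer would use the provider's tokenizer.

```python
import json

def estimate_tokens(text: str) -> int:
    # Crude heuristic: ~4 characters per token. A real tool would use
    # the provider's tokenizer for accurate counts.
    return max(1, len(text) // 4)

def context_breakdown(payload: dict) -> dict:
    """Return the percentage of estimated tokens per context category.

    Assumes a hypothetical chat-request shape: top-level "system" string,
    "tools" list of JSON schemas, and "messages" whose content may be a
    string or a list of typed blocks (e.g. {"type": "tool_result", ...}).
    """
    buckets = {"system": 0, "tool_definitions": 0, "tool_results": 0, "messages": 0}
    buckets["system"] = estimate_tokens(payload.get("system", ""))
    for tool in payload.get("tools", []):
        # Tool schemas are serialized into the prompt, so count their JSON text.
        buckets["tool_definitions"] += estimate_tokens(json.dumps(tool))
    for msg in payload.get("messages", []):
        content = msg.get("content", "")
        if isinstance(content, list):  # structured content blocks
            for block in content:
                key = "tool_results" if block.get("type") == "tool_result" else "messages"
                buckets[key] += estimate_tokens(json.dumps(block))
        else:
            buckets["messages"] += estimate_tokens(content)
    total = sum(buckets.values())
    return {k: round(100 * v / total, 1) for k, v in buckets.items()}
```

Run against one captured request, a breakdown like `{"tool_definitions": 69.0, ...}` would correspond to the Opus-style overhead pattern noted above, while a dominant `tool_results` share would match the Gemini pattern.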