DeepSeek V4 Pro at 5% the cost of Claude – what it takes to close the gap
6 hours ago
- #Harness Design
- #AI Coding Assistants
- #DeepSeek V4 Pro
- DeepSeek V4 Pro is significantly cheaper than Claude Sonnet 4, priced roughly 5-7 times lower across input, cached-input, and output tokens.
- With proper harness design, V4 Pro reaches about 90% of Claude's performance in real workflows, even though benchmark numbers put it at 80-85%.
- Compared to Claude, V4 Pro is weaker at long-horizon planning in unfamiliar codebases, reading sloppy code, and first-shot UI work.
- V4 Pro excels at following precise specs, writing numerical/scientific code, and Bash/ops tasks, making it effective for targeted use cases.
- Hash-anchored editing, inspired by Akay's hashline pattern, reduces retries and token usage by allowing edits via line references instead of exact string matching.
- Optimizing the DeepSeek cache through stable system prompts, stripping reasoning content, and deterministic tool serialization can achieve high hit ratios, cutting costs.
- Features like storm-breaker (synthesized error responses), Plan mode (read-only planning), and Rewind (content-addressed snapshots) make autonomous loops more usable.
- The cwcode harness, built in Go with a Sink interface for flexibility, is used in production for tasks like dose-prediction modeling, financial research, and self-improvement.
- Lessons from harness development include fixing bugs in terminal input handling and the command popup through rapid iteration and user feedback.
- Harness design is critical for leveraging V4 Pro's cost-effectiveness, with key practices including hash-anchored edits, cache optimization, and user-friendly error handling.
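The hash-anchored editing idea from the list above can be sketched roughly as follows. This is a minimal illustration, not cwcode's actual implementation: the names `lineTag`, `Annotate`, `ReplaceLine`, and `ErrStale`, the `N:tag|text` display format, and the 4-character SHA-256 tag are all assumptions made for the example. The model is shown each line with a short content hash and edits by quoting the line number plus tag, so the harness can reject an edit against stale content without requiring an exact string match.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"strings"
)

// lineTag returns a short content hash for one line.
// The 4-hex-character length is an arbitrary choice for this sketch.
func lineTag(line string) string {
	sum := sha256.Sum256([]byte(line))
	return hex.EncodeToString(sum[:])[:4]
}

// Annotate renders a file as the model would see it: "N:tag|text" per line.
// The format is hypothetical.
func Annotate(src string) string {
	var b strings.Builder
	for i, l := range strings.Split(src, "\n") {
		fmt.Fprintf(&b, "%d:%s|%s\n", i+1, lineTag(l), l)
	}
	return b.String()
}

// ErrStale signals that the referenced line changed after the model read it.
var ErrStale = errors.New("stale anchor: line changed since it was read")

// ReplaceLine swaps line n (1-indexed) for newText, but only if tag still
// matches the current content of that line.
func ReplaceLine(src string, n int, tag, newText string) (string, error) {
	lines := strings.Split(src, "\n")
	if n < 1 || n > len(lines) {
		return "", fmt.Errorf("line %d out of range (file has %d lines)", n, len(lines))
	}
	if lineTag(lines[n-1]) != tag {
		return "", ErrStale
	}
	lines[n-1] = newText
	return strings.Join(lines, "\n"), nil
}

func main() {
	src := "func add(a, b int) int {\n\treturn a - b\n}"
	fmt.Print(Annotate(src))

	// The model refers to line 2 by number and tag instead of quoting it.
	out, err := ReplaceLine(src, 2, lineTag("\treturn a - b"), "\treturn a + b")
	if err != nil {
		fmt.Println("edit rejected:", err)
		return
	}
	fmt.Println(out)
}
```

A rejected edit costs one short error message rather than a full retry with a re-quoted string, which is where the savings in retries and tokens would come from under this design.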
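Two of the cache practices named above, deterministic tool serialization and stripping reasoning content, might look something like the sketch below. It assumes a prefix-based cache, where any byte-level change early in the request invalidates everything after it. The `Tool` and `Message` types, the `reasoning_content` field name, and the helper names are assumptions for illustration, not a confirmed API shape.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
)

// Tool is a hypothetical tool definition sent with every request.
type Tool struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Parameters  map[string]any `json:"parameters"`
}

// Message is a hypothetical chat message; ReasoningContent stands in for the
// model's thinking text attached to an assistant turn.
type Message struct {
	Role             string `json:"role"`
	Content          string `json:"content"`
	ReasoningContent string `json:"reasoning_content,omitempty"`
}

// SerializeTools emits the same bytes regardless of registration order.
// Tools are sorted by name; encoding/json already sorts map keys.
func SerializeTools(tools []Tool) ([]byte, error) {
	sorted := make([]Tool, len(tools))
	copy(sorted, tools)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Name < sorted[j].Name })
	return json.Marshal(sorted)
}

// StripReasoning returns a copy of the history with reasoning text removed,
// so replayed turns stay byte-identical from one request to the next.
func StripReasoning(history []Message) []Message {
	out := make([]Message, len(history))
	for i, m := range history {
		m.ReasoningContent = ""
		out[i] = m
	}
	return out
}

func main() {
	a, _ := SerializeTools([]Tool{{Name: "read"}, {Name: "bash"}})
	b, _ := SerializeTools([]Tool{{Name: "bash"}, {Name: "read"}})
	fmt.Println("identical prefix bytes:", bytes.Equal(a, b))

	hist := []Message{{Role: "assistant", Content: "done", ReasoningContent: "long chain of thought"}}
	clean, _ := json.Marshal(StripReasoning(hist))
	fmt.Println(string(clean))
}
```

The same reasoning applies to the system prompt: anything that varies per request (timestamps, random IDs) would need to sit after the stable prefix rather than inside it.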