DeepSeek V4 Pro at 5% the cost of Claude – what it takes to close the gap

6 hours ago

#Harness Design
#AI Coding Assistants
#DeepSeek V4 Pro

DeepSeek V4 Pro is significantly cheaper than Claude Sonnet 4, costing about 5-7 times less across input, cache, and output tokens.
The performance gap between V4 Pro and Claude is reduced to about 90% in real workflows with proper harness design, despite benchmark numbers showing 80-85%.
V4 Pro struggles with long-horizon planning in unfamiliar codebases, reading sloppy code, and first-shot UI work compared to Claude.
V4 Pro excels at following precise specs, writing numerical/scientific code, and Bash/ops tasks, making it effective for targeted use cases.
Hash-anchored editing, inspired by Akay's hashline pattern, reduces retries and token usage by allowing edits via line references instead of exact string matching.
Optimizing the DeepSeek cache through stable system prompts, stripping reasoning content, and deterministic tool serialization can achieve high hit ratios, cutting costs.
Implementing features like storm-breaker (synthesized error responses), Plan mode (read-only planning), and Rewind (content-addressed snapshots) enhances autonomous loop usability.
The cwcode harness, built in Go with a Sink interface for flexibility, is used in production for tasks like dose-prediction modeling, financial research, and self-improvement.
Lessons from harness development include fixing bugs like terminal input handling and command popup issues through rapid iteration and user feedback.
Harness design is critical for leveraging V4 Pro's cost-effectiveness, with key practices including hash-anchored edits, cache optimization, and user-friendly error handling.

Hasty Briefsbeta

DeepSeek V4 Pro at 5% the cost of Claude – what it takes to close the gap