Claude Sonnet 4.6

3 months ago

#AI
#Claude Sonnet
#Machine Learning

Claude Sonnet 4.6 is the most capable Sonnet model yet, with upgrades in coding, computer use, long-context reasoning, and more.
It features a 1M-token context window in beta and is now the default model for Free and Pro plans on claude.ai and Claude Cowork.
Pricing remains the same as Sonnet 4.5, starting at $3 per million input tokens and $15 per million output tokens.
Improved coding skills, consistency, and instruction following make Sonnet 4.6 preferred by developers over its predecessor and even Claude Opus 4.5 in some cases.
Major improvements in computer use skills, with early users reporting human-level capability in tasks like navigating complex spreadsheets or filling out multi-step web forms.
Sonnet 4.6 shows significant resistance to prompt injection attacks, performing similarly to Opus 4.6 in safety evaluations.
Performance improvements across benchmarks, approaching Opus-level intelligence at a more practical price point.
In early Claude Code testing, users preferred Sonnet 4.6 over Sonnet 4.5 70% of the time, noting that it reads context more carefully before changing code and consolidates shared logic instead of duplicating it.
Sonnet 4.6's 1M token context window allows for effective reasoning across entire codebases, lengthy contracts, or dozens of research papers.
Notable improvements in frontend code, financial analysis, visual outputs, and design sensibility, requiring fewer iterations for production-quality results.
Matches Opus 4.6 performance on OfficeQA, a significant upgrade for document comprehension workloads.
Excels at complex code fixes, bug detection, and agentic coding at scale, with strong resolution rates and consistency.
First Sonnet model to offer frontier-level reasoning in a smaller, more cost-effective form factor.
Significant improvements in answer retrieval, with better recall on specific workflows in the Financial Services Benchmark.
Outperforms Sonnet 4.5 in heavy reasoning Q&A by 15 percentage points in evaluations by Box.
Achieves 94% on an insurance benchmark, making it the highest-performing model tested for computer use in mission-critical workflows.
Delivers frontier-level results on complex app builds and bug-fixing, becoming the go-to for deep codebase work.
Produces the best iOS code tested for Rakuten AI, with better spec compliance, architecture, and modern tooling.
Strong performance on branched and multi-step tasks like contract routing, conditional template selection, and CRM coordination.
Described by an early customer as having perfect design taste when building frontend pages and data reports, requiring less hand-holding.
Exceptionally responsive to direction, delivering precise figures and structured comparisons while generating useful ideas.
Supports adaptive thinking, extended thinking, and context compaction in beta on the Claude Developer Platform.
API updates: the web search and fetch tools now automatically write and execute code to filter search results, improving response quality and token efficiency.
Opus 4.6 remains the strongest option for tasks requiring the deepest reasoning, such as codebase refactoring and coordinating multiple agents.
Claude in Excel now supports MCP connectors, allowing integration with tools like S&P Global, LSEG, and FactSet.
Available on all Claude plans, Claude Cowork, Claude Code, API, and major cloud platforms, with the free tier upgraded to Sonnet 4.6 by default.
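
To make the pricing and context-window figures above concrete, here is a minimal Python sketch that estimates the cost of a single request at the quoted base rates ($3 input, $15 output per million tokens) and checks whether it fits in the 1M-token window. The helper names are hypothetical, and two simplifications are assumed: the summary says pricing is "starting at" these rates, so larger requests may be billed differently, and the sketch counts both prompt and response tokens against the window.

```python
# Base rates quoted in the summary (USD per million tokens).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00
# Beta context window size quoted in the summary.
CONTEXT_WINDOW_TOKENS = 1_000_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: flat per-token cost at the base rates only."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


def fits_in_context(input_tokens: int, output_tokens: int) -> bool:
    """Hypothetical helper: assumes prompt and response share one window."""
    return input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 4,000-token response.
    cost = estimate_cost_usd(200_000, 4_000)
    print(f"${cost:.2f}")  # prints $0.66
    print(fits_in_context(200_000, 4_000))  # prints True
```

Output tokens cost five times as much as input tokens at these rates, so response length can matter more than prompt length when budgeting for long generations.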
