DeepSeekMath-V2: Towards Self-Verifiable Mathematical Reasoning
- #self-verification
- #theorem-proving
- #mathematical-reasoning
- Large language models (LLMs) have advanced substantially in mathematical reasoning, which serves both as a testbed for AI capabilities and as a potential aid to scientific research.
- Current approaches focus on final-answer accuracy but overlook whether the underlying reasoning is correct, which is essential for tasks like theorem proving.
- DeepSeekMath-V2 introduces self-verifiable mathematical reasoning, training an LLM-based verifier to ensure rigorous and comprehensive proofs.
- The verifier is then used as a reward model to train a proof generator, incentivizing the generator to find and fix flaws in its own proofs before finalizing them.
- As the generator improves, verification compute is scaled up to automatically label new, hard-to-verify proofs, producing training data that keeps the verifier improving.
- The model achieves top scores on IMO 2025, CMO 2024, and Putnam 2024, demonstrating strong theorem-proving capabilities.
- Built on DeepSeek-V3.2-Exp-Base, the model is licensed under Apache 2.0 and available for inference via the DeepSeek-V3.2-Exp GitHub repository.
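The generate-verify-refine loop described above can be sketched as follows. This is a minimal conceptual sketch, not the paper's implementation: `generate`, `verify`, and `refine` are hypothetical stand-ins for the LLM calls (generator, verifier, and self-correction step), replaced here by toy functions so the control flow is runnable.

```python
def generate(problem):
    # Hypothetical stand-in for the proof generator (an LLM call in practice).
    return {"problem": problem, "steps": ["draft proof"], "quality": 0.2}

def verify(proof):
    # Hypothetical stand-in for the LLM-based verifier: returns a score in
    # [0, 1] plus a list of identified issues. In the paper's setup, the
    # verifier's judgment serves as the reward signal for the generator.
    issues = [] if proof["quality"] >= 0.9 else ["gap in step 1"]
    return proof["quality"], issues

def refine(proof, issues):
    # Hypothetical stand-in for self-correction: the generator revises its
    # proof conditioned on the verifier's feedback.
    return {**proof,
            "steps": proof["steps"] + [f"fix: {i}" for i in issues],
            "quality": min(1.0, proof["quality"] + 0.4)}

def prove(problem, threshold=0.9, max_rounds=4):
    """Generate a proof, then iteratively self-verify and self-correct
    until the verifier's score clears the threshold."""
    proof = generate(problem)
    for _ in range(max_rounds):
        score, issues = verify(proof)
        if score >= threshold:
            return proof, score
        proof = refine(proof, issues)
    return proof, verify(proof)[0]
```

The key design idea the sketch mirrors is that verification, not just generation, sits inside the training loop: the same scoring signal that gates a final proof also rewards intermediate self-correction.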