DeepSeek-v3.1
3 days ago
- #DeepSeek
- #AI
- #Language Model
- DeepSeek-V3.1 is a hybrid model supporting both thinking and non-thinking modes.
- Improvements include smarter tool calling, higher thinking efficiency, and hybrid thinking mode.
- The model is trained with extended long-context data (32K and 128K phases).
- Supports tool calls in non-thinking mode and specific formats for search and code agents.
- Evaluation shows superior performance in general, search agent, code, and math benchmarks.
- Includes detailed chat templates for both thinking and non-thinking modes.
- Available for download with 128K context length and 37B activated parameters.
- Usage examples and local running instructions are provided.
- Licensed under MIT License and includes citation details.