Hasty Briefsbeta

DeepSeek-v3.1

3 days ago
  • #DeepSeek
  • #AI
  • #Language Model
  • DeepSeek-V3.1 is a hybrid model supporting both thinking and non-thinking modes.
  • Improvements include smarter tool calling, higher thinking efficiency, and hybrid thinking mode.
  • The model is trained with extended long-context data (32K and 128K phases).
  • Supports tool calls in non-thinking mode and specific formats for search and code agents.
  • Evaluation shows superior performance in general, search agent, code, and math benchmarks.
  • Includes detailed chat templates for both thinking and non-thinking modes.
  • Available for download with 128K context length and 37B activated parameters.
  • Usage examples and local running instructions are provided.
  • Licensed under MIT License and includes citation details.