DeepSeek-V3.2-Exp
a day ago
- #DeepSeek
- #AI
- #Machine Learning
- DeepSeek-V3.2-Exp is the latest experimental model based on V3.1-Terminus.
- Introduces DeepSeek Sparse Attention (DSA) for faster, more efficient training and inference on long-context inputs.
- Available in the app, on the web, and through the API, with API prices reduced by 50% or more.
- DSA improves long-context performance and reduces compute costs with minimal impact on output quality.
- Benchmarks show V3.2-Exp performs similarly to V3.1-Terminus.
- V3.1-Terminus remains available via a temporary API until Oct 15th, 2025, for comparison testing.
- Feedback on DSA is encouraged via the provided link.
- Model and tech report are open-sourced on Hugging Face and GitHub.
- Includes key GPU kernels in TileLang & CUDA, with TileLang recommended for rapid prototyping.
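
The summary above does not describe how DSA works internally, so the sketch below is only a generic illustration of why sparse attention can cut compute on long contexts. It is not DeepSeek's implementation. It assumes one common pattern, top-k key selection: each query attends to only the `k` highest-scoring positions instead of all of them, so the per-query attention cost grows with `k` rather than with the full context length. All names here (`dense_attention`, `topk_sparse_attention`, `index_scores`) are hypothetical, and the example reuses the query-key dot product as the relevance score purely for simplicity; a practical system would need a cheaper scoring step for the selection to pay off.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def dense_attention(query, keys, values):
    """Scaled dot-product attention for one query over every key."""
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, key) * scale for key in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def topk_sparse_attention(query, keys, values, index_scores, k):
    """Attend only to the k positions with the highest index_scores.

    index_scores is a hypothetical per-position relevance score supplied
    by the caller; how such scores are produced is an assumption here.
    Returns the attention output and the kept positions in original order.
    """
    ranked = sorted(range(len(keys)), key=lambda i: index_scores[i], reverse=True)
    kept = sorted(ranked[:k])
    output = dense_attention(query, [keys[i] for i in kept], [values[i] for i in kept])
    return output, kept


# Toy data: four positions, two-dimensional vectors.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [0.0, 0.0]]

# Illustrative relevance scores (assumption: plain query-key dot products).
index_scores = [dot(query, key) for key in keys]

sparse_out, kept = topk_sparse_attention(query, keys, values, index_scores, k=2)
dense_out = dense_attention(query, keys, values)
print("kept positions:", kept)
print("sparse output:", sparse_out)
print("dense output:", dense_out)
```

With `k` equal to the number of positions, the sparse function reduces to dense attention, which gives a simple way to check how much output quality changes as `k` shrinks.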