VibeThinker-1.5B
- #AI
- #Machine Learning
- #Reasoning Models
- VibeThinker-1.5B is a 1.5B-parameter dense model challenging the notion that small models lack robust reasoning capabilities.
- It is post-trained with the 'Spectrum-to-Signal Principle' (SSP) methodology.
- On coding benchmarks, it outperforms closed-source models such as Magistral Medium and Claude Opus 4, and matches the open-source GPT OSS-20B Medium.
- Surpasses the 400x larger DeepSeek R1 model on mathematical benchmarks AIME24, AIME25, and HMMT25.
- Ultra-efficient: delivers math and coding performance competitive with models hundreds of times larger, at only 1.5B parameters.
- The SSP methodology pairs 'Two-Stage Diversity-Exploring Distillation' for supervised fine-tuning with 'MaxEnt-Guided Policy Optimization (MGPO)' for reinforcement learning (a toy sketch of the MGPO idea appears at the end of this post).
- Cost-effective: post-training costs just $7,800, compared to $294K-$535K for competitors.
- Model weights and the technical report are open-sourced and available on Hugging Face and ModelScope.
- Best suited to competition-style math and coding problems, run with the sampling settings recommended in the model card.
- Includes a code snippet for using the model with transformers (a minimal sketch in the same spirit follows below).
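
A minimal sketch of the transformers usage, assuming the Hugging Face repo id `WeiboAI/VibeThinker-1.5B`; the sampling values (temperature, top_p, max_new_tokens) are placeholder assumptions to verify against the model card:

```python
# Minimal sketch: running VibeThinker-1.5B with Hugging Face transformers.
# Repo id and sampling values are assumptions; check the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "WeiboAI/VibeThinker-1.5B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Competition-style math prompt, formatted with the model's chat template.
messages = [{"role": "user", "content": "Find all real x with x^2 - 5x + 6 = 0."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models are typically sampled rather than decoded greedily,
# with a generous token budget for the chain of thought.
outputs = model.generate(
    input_ids,
    max_new_tokens=4096,  # placeholder; the card may recommend a larger budget
    do_sample=True,
    temperature=0.6,      # placeholder; verify against the model card
    top_p=0.95,           # placeholder; verify against the model card
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```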
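
MGPO itself is specified in the technical report; the toy sketch below is not that algorithm, only an illustration of the maximum-entropy intuition behind its name: problems the model solves about half the time have the highest pass/fail entropy, so they would receive the most training weight.

```python
# Toy illustration of maximum-entropy-guided problem weighting.
# NOT the MGPO algorithm from the report, just the underlying intuition:
# pass rates near 0.5 have maximal binary entropy and get the largest weight.
import math

def pass_rate_entropy(p: float) -> float:
    """Binary entropy H(p) in nats; maximal at p = 0.5, zero at p = 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def problem_weights(pass_rates: list[float]) -> list[float]:
    """Normalize per-problem entropies into sampling weights for training."""
    entropies = [pass_rate_entropy(p) for p in pass_rates]
    total = sum(entropies) or 1.0  # guard against an all-solved/unsolved batch
    return [h / total for h in entropies]

# Always-solved (1.0) and never-solved (0.0) problems get zero weight;
# the 0.5 problem dominates because it is maximally informative.
print(problem_weights([0.0, 0.25, 0.5, 0.9, 1.0]))
```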