VibeThinker-1.5B
- #AI
- #Machine Learning
- #Reasoning Models
- VibeThinker-1.5B is a 1.5B-parameter dense model challenging the notion that small models lack robust reasoning capabilities.
- It is post-trained with the 'Spectrum-to-Signal Principle' (SSP) methodology.
- On coding benchmarks, it outperforms closed-source models such as Magistral Medium and Claude Opus 4, and matches the open-source GPT OSS-20B Medium.
- Surpasses the 400x larger DeepSeek R1 model on mathematical benchmarks AIME24, AIME25, and HMMT25.
- Ultra-efficient: delivers math and coding performance competitive with models hundreds of times larger, at only 1.5B parameters.
- The SSP methodology pairs 'Two-Stage Diversity-Exploring Distillation' for supervised fine-tuning with 'MaxEnt-Guided Policy Optimization (MGPO)' for reinforcement learning (a toy sketch of the MGPO idea appears at the end of this post).
- Cost-effective: post-training costs just $7,800, compared to $294K-$535K for competitors.
- Model weights and the technical report are open-sourced and available on Hugging Face and ModelScope.
- Best suited to competition-style math and coding problems, run with the sampling settings recommended in the model card.
- Includes a code snippet for using the model with transformers (a minimal sketch in the same spirit follows below).
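
A minimal sketch of the transformers usage, assuming the Hugging Face repo id `WeiboAI/VibeThinker-1.5B`; the sampling values (temperature, top_p, max_new_tokens) are placeholder assumptions to verify against the model card:

```python
# Minimal sketch: running VibeThinker-1.5B with Hugging Face transformers.
# Repo id and sampling values are assumptions; check the model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "WeiboAI/VibeThinker-1.5B"  # assumed Hugging Face repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Competition-style math prompt, formatted with the model's chat template.
messages = [{"role": "user", "content": "Find all real x with x^2 - 5x + 6 = 0."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models are typically sampled rather than decoded greedily,
# with a generous token budget for the chain of thought.
outputs = model.generate(
    input_ids,
    max_new_tokens=4096,  # placeholder; the card may recommend a larger budget
    do_sample=True,
    temperature=0.6,      # placeholder; verify against the model card
    top_p=0.95,           # placeholder; verify against the model card
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```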
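
MGPO itself is specified in the technical report; the toy sketch below is not that algorithm, only an illustration of the maximum-entropy intuition behind its name: problems the model solves about half the time have the highest pass/fail entropy, so they would receive the most training weight.

```python
# Toy illustration of maximum-entropy-guided problem weighting.
# NOT the MGPO algorithm from the report, just the underlying intuition:
# pass rates near 0.5 have maximal binary entropy and get the largest weight.
import math

def pass_rate_entropy(p: float) -> float:
    """Binary entropy H(p) in nats; maximal at p = 0.5, zero at p = 0 or 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

def problem_weights(pass_rates: list[float]) -> list[float]:
    """Normalize per-problem entropies into sampling weights for training."""
    entropies = [pass_rate_entropy(p) for p in pass_rates]
    total = sum(entropies) or 1.0  # guard against an all-solved/unsolved batch
    return [h / total for h in entropies]

# Always-solved (1.0) and never-solved (0.0) problems get zero weight;
# the 0.5 problem dominates because it is maximally informative.
print(problem_weights([0.0, 0.25, 0.5, 0.9, 1.0]))
```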