Hasty Briefs

VibeThinker-1.5B

10 days ago
  • #AI
  • #Machine Learning
  • #Reasoning Models
  • VibeThinker-1.5B is a 1.5B-parameter dense model that challenges the notion that small models lack robust reasoning capabilities.
  • It uses the innovative 'Spectrum-to-Signal Principle (SSP)' post-training methodology.
  • Outperforms closed-source models like Magistral Medium and Claude Opus 4, and matches open-source models like GPT OSS-20B Medium.
  • Surpasses the 400x larger DeepSeek R1 model on mathematical benchmarks AIME24, AIME25, and HMMT25.
  • Ultra-efficient: achieves state-of-the-art performance in math and coding tasks with only 1.5B parameters.
  • Innovative methodology includes 'Two-Stage Diversity-Exploring Distillation' and 'MaxEnt-Guided Policy Optimization (MGPO)'.
  • Cost-effective: post-training costs just $7,800, compared to $294K-$535K for competitors.
  • Model weights and technical report are open-sourced and available on Hugging Face and ModelScope.
  • Recommended for competition-style math and coding problems, using the developers' suggested inference parameter settings.
  • The release includes a code snippet for running the model with Hugging Face transformers.
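
The MGPO bullet above describes guiding policy optimization by entropy. One plausible reading is that problems where the model succeeds about half the time (maximum-entropy outcomes) carry the most learning signal. The sketch below is an illustrative assumption, not the technical report's exact formula — it weights each training problem by the binary entropy of its measured pass rate:

```python
import math

def mgpo_weight(pass_rate: float, eps: float = 1e-6) -> float:
    """Entropy-based weight for a training problem.

    Hypothetical illustration of "MaxEnt-guided" weighting: problems
    near a 50% pass rate (maximum uncertainty) get the largest weight,
    while problems the model always solves or always fails get almost
    none. The actual MGPO objective is defined in the technical report.
    """
    p = min(max(pass_rate, eps), 1.0 - eps)  # clamp away from 0 and 1
    # Binary entropy of the success/failure distribution, in nats.
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))

# Problems near a 50% pass rate receive the highest weight.
weights = {p: mgpo_weight(p) for p in (0.05, 0.5, 0.95)}
```

Under this weighting, a problem solved 50% of the time gets weight ln 2 ≈ 0.693, while near-trivial or near-impossible problems contribute almost nothing to the update.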
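
The official transformers snippet is not reproduced in this brief. As a stand-in, the sketch below shows a typical Hugging Face transformers chat-style generation call; the repo id `WeiboAI/VibeThinker-1.5B` and the generation settings are assumptions — consult the model card on Hugging Face for the recommended values:

```python
# Hypothetical sketch of running VibeThinker-1.5B with Hugging Face
# transformers; repo id and sampling settings are assumptions, not
# taken from the official snippet.
MODEL_ID = "WeiboAI/VibeThinker-1.5B"  # assumed Hugging Face repo id

def build_messages(problem: str) -> list[dict]:
    """Wrap a single math/coding problem as a one-turn chat."""
    return [{"role": "user", "content": problem}]

def solve(problem: str, max_new_tokens: int = 4096) -> str:
    # Imported lazily so build_messages stays dependency-free.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

    # Render the chat with the model's own template, ending at the
    # assistant turn so generation continues from there.
    prompt = tokenizer.apply_chat_template(
        build_messages(problem), tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=True
    )
    # Strip the prompt tokens; keep only the generated continuation.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
    )
```

Usage would look like `solve("Find the number of positive divisors of 360.")`; reasoning models tend to emit long chains of thought, hence the generous `max_new_tokens`.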