Hasty Briefs

M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

a year ago
  • #Mamba Architecture
  • #Reasoning Models
  • #Machine Learning
  • Introduces M1, a hybrid linear RNN reasoning model based on the Mamba architecture for memory-efficient inference.
  • Leverages distillation from existing reasoning models and RL training to enhance performance.
  • Outperforms previous linear RNN models and matches state-of-the-art DeepSeek R1 distilled reasoning models on the AIME and MATH benchmarks.
  • Achieves a more than 3x inference speedup over same-size Transformers when served with vLLM, enabling higher accuracy under a fixed generation-time budget.
  • Proposes an effective approach to scaling test-time generation using self-consistency or long chain-of-thought reasoning.
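The self-consistency strategy mentioned above amounts to sampling several reasoning chains and majority-voting over their final answers. A minimal sketch, where `generate` and `extract_answer` are hypothetical stand-ins for the model's sampling call and answer parser (neither is part of the M1 release):

```python
from collections import Counter
from itertools import cycle

def self_consistency(generate, prompt, n_samples=8, extract=lambda s: s.strip()):
    """Sample n reasoning chains and return the majority-vote answer.

    `generate` is a hypothetical sampling function (prompt -> completion);
    `extract` pulls the final answer string out of a chain of thought.
    """
    answers = [extract(generate(prompt)) for _ in range(n_samples)]
    best_answer, _count = Counter(answers).most_common(1)[0]
    return best_answer

# Deterministic toy stand-in for a stochastic model: most chains agree on "42".
_fake_chains = cycle(["42", "17", "42", "42"])
def fake_generate(prompt):
    return next(_fake_chains)

print(self_consistency(fake_generate, "What is 6*7?"))  # prints "42"
```

The appeal under a fixed time budget is that a faster model (here, the Mamba-based M1) can afford more samples per question, and accuracy typically grows with the number of voted chains.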