Hasty Briefs (beta)

TinyLoRA – Learning to Reason in 13 Parameters

5 days ago
  • #Reasoning in Language Models
  • #Low-Rank Adaptation
  • #Reinforcement Learning
  • The paper introduces TinyLoRA, a method that shrinks low-rank adapters down to as little as a single trainable parameter.
  • TinyLoRA trains an 8B parameter Qwen2.5 model to achieve 91% accuracy on GSM8K using only 13 parameters (26 bytes in bf16).
  • TinyLoRA recovers roughly 90% of the full baseline's performance gains with 1000x fewer trainable parameters on benchmarks such as AIME, AMC, and MATH500.
  • Reinforcement learning (RL) is essential: supervised fine-tuning (SFT) needs 100-1000x larger updates to reach similar results.
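The summary does not say how TinyLoRA parameterizes its adapters, but one plausible way to get down to a single trainable parameter per layer is to freeze random rank-1 directions and train only a scalar gate on them. The sketch below illustrates that idea; the class name, the frozen-direction scheme, and the initialization are all assumptions for illustration, not the paper's actual method.

```python
import torch
import torch.nn as nn

class TinyScalarAdapter(nn.Module):
    """Hypothetical sketch of a one-parameter adapter.

    The pretrained linear layer is frozen; `u` and `v` are fixed random
    rank-1 directions (buffers, not trained); only the scalar `scale`
    is learned, giving one trainable parameter per adapted layer.
    """

    def __init__(self, base: nn.Linear):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained weights
        out_f, in_f = base.weight.shape
        # Fixed random directions; could be regenerated from a seed at load time.
        self.register_buffer("u", torch.randn(out_f) / out_f**0.5)
        self.register_buffer("v", torch.randn(in_f) / in_f**0.5)
        # The single trainable parameter, initialized to zero so the
        # adapter starts as an identity modification.
        self.scale = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Equivalent to y = x @ (W + scale * u v^T)^T + b: a rank-1 update.
        return self.base(x) + self.scale * (x @ self.v).unsqueeze(-1) * self.u
```

With 13 such scalars across selected layers, the entire update fits in 26 bytes of bf16, matching the storage figure quoted above.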