TinyLoRA – Learning to Reason in 13 Parameters
- #Reasoning in Language Models
- #Low-Rank Adaptation
- #Reinforcement Learning
- The paper introduces TinyLoRA, a method for training low-rank adapters with as few as a single trainable parameter.
- TinyLoRA trains an 8B-parameter Qwen2.5 model to 91% accuracy on GSM8K while updating only 13 parameters (26 bytes in bf16).
- Across benchmarks including AIME, AMC, and MATH500, TinyLoRA reaches about 90% of baseline performance with roughly 1000x fewer trainable parameters.
- Reinforcement learning (RL) is essential; supervised fine-tuning (SFT) requires 100-1000x larger updates for similar results.
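To make the "adapter with one trainable parameter" idea concrete, here is a minimal sketch, not the paper's actual method: the low-rank directions `u` and `v` are frozen at random initialization, and only a single scalar `alpha` scaling the rank-1 update is trained. All names and the specific construction are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out = 64, 64
W = rng.standard_normal((d_out, d_in))  # frozen pretrained weight (illustrative)

# Frozen random rank-1 directions; only the scalar `alpha` would be trained.
u = rng.standard_normal(d_out)
v = rng.standard_normal(d_in)
alpha = 0.0  # the single trainable parameter in this sketch

def forward(x, alpha):
    # Adapted layer: W @ x + alpha * u * (v . x), i.e. a rank-1 update
    # alpha * (u v^T) added to the frozen weight W.
    return W @ x + alpha * u * (v @ x)

x = rng.standard_normal(d_in)
# With alpha = 0 the adapter is inactive and the output equals the base layer.
assert np.allclose(forward(x, 0.0), W @ x)

n_trainable = 1  # alpha only; W, u, and v stay frozen
```

Under this construction the trainable-parameter count is independent of the layer's width, which is what makes budgets like 13 parameters for an 8B model arithmetically possible (e.g. one scalar per adapted layer or group of layers).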