Hasty Briefs

Apple: Embarrassingly Simple Self-Distillation Improves Code Generation

6 hours ago
  • #LLMs
  • #code-generation
  • #self-distillation
  • Simple self-distillation (SSD) improves code generation in LLMs by fine-tuning on the model's own raw outputs without external verifiers or reinforcement learning.
  • SSD significantly boosts performance; e.g., Qwen3-30B-Instruct's pass@1 on LiveCodeBench v6 improved from 42.4% to 55.3%, especially on harder problems.
  • The method works by addressing a precision-exploration conflict, reshaping token distributions to suppress distractors while preserving diversity.
  • SSD generalizes across Qwen and Llama models at various scales (4B, 8B, 30B) and across instruct and thinking variants.
  • It offers a complementary post-training direction for enhancing LLM code generation without complex setups.
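The pipeline the bullets describe can be sketched in a few lines. This is a hypothetical toy illustration, not the paper's implementation: `generate` and `fine_tune` are stand-in stubs, and the key point is only the data flow, where the model's own raw completions, with no verifier, filter, or reward model, become its supervised fine-tuning data.

```python
def generate(model, prompt, n_samples=4):
    """Stub: sample n raw completions from the model for a prompt."""
    return [f"{model['name']} solution {i} for: {prompt}" for i in range(n_samples)]

def fine_tune(model, dataset):
    """Stub: one SFT pass over (prompt, completion) pairs."""
    return dict(model, sft_steps=model.get("sft_steps", 0) + len(dataset))

def simple_self_distillation(model, prompts):
    # 1. Collect the model's own raw outputs (no external verifier, no RL).
    dataset = [(p, c) for p in prompts for c in generate(model, p)]
    # 2. Fine-tune the same model on that self-generated dataset.
    return fine_tune(model, dataset), dataset

model = {"name": "toy-llm"}
model, dataset = simple_self_distillation(model, ["two-sum", "reverse-list"])
print(len(dataset))  # 2 prompts x 4 samples = 8 training pairs
```

The absence of any filtering step is what makes the recipe "embarrassingly simple": per the summary, the gains come from reshaping token distributions rather than from selecting better samples.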