Self-Distillation Enables Continual Learning [pdf]

  • #self-distillation
  • #foundation-models
  • #continual-learning
  • Self-Distillation Fine-Tuning (SDFT) enables on-policy learning from expert demonstrations for continual learning.
  • SDFT uses the demonstration-conditioned model as its own teacher, sampling on-policy training targets from it (see the sketch after this list).
  • SDFT outperforms supervised fine-tuning (SFT), achieving higher accuracy on new tasks while reducing catastrophic forgetting.
  • In sequential learning, SDFT lets a single model accumulate multiple skills over time without regressing on earlier ones.
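
To make the teacher-student mechanism in the second bullet concrete, here is a minimal Python sketch in a Hugging Face `transformers` style. The model name, the `REWRITE_TEMPLATE` prompt, and the `sdft_step` helper are illustrative assumptions rather than the paper's exact recipe: the model first generates an answer while conditioned on the expert demonstration (the teacher pass), then is fine-tuned on that self-generated answer with the demonstration removed from context (the student pass).

```python
# A minimal sketch of the SDFT idea, assuming a HuggingFace-style causal LM.
# MODEL_NAME, REWRITE_TEMPLATE, and sdft_step are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt2"  # placeholder; real use would involve an instruction-tuned LM

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Hypothetical distillation prompt: conditions the model on the expert
# demonstration and asks it to restate the answer in its own words.
REWRITE_TEMPLATE = (
    "Task: {instruction}\n"
    "Reference answer: {demonstration}\n"
    "Restate the reference answer in your own words:\n"
)

def sdft_step(instruction: str, demonstration: str) -> float:
    # 1) Teacher pass: sample an on-policy target from the model itself,
    #    conditioned on the expert demonstration (no gradients needed).
    teacher_prompt = REWRITE_TEMPLATE.format(
        instruction=instruction, demonstration=demonstration
    )
    enc = tokenizer(teacher_prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(
            **enc,
            max_new_tokens=128,
            do_sample=True,
            top_p=0.9,
            pad_token_id=tokenizer.eos_token_id,
        )
    target = tokenizer.decode(
        out[0, enc["input_ids"].shape[1]:], skip_special_tokens=True
    )

    # 2) Student pass: ordinary next-token loss on (instruction -> target),
    #    with the demonstration removed from context, so the training signal
    #    stays close to the model's own output distribution.
    batch = tokenizer(f"Task: {instruction}\nAnswer: {target}", return_tensors="pt")
    labels = batch["input_ids"].clone()
    prompt_len = tokenizer(
        f"Task: {instruction}\nAnswer: ", return_tensors="pt"
    )["input_ids"].shape[1]
    labels[:, :prompt_len] = -100  # score only the self-generated target tokens

    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

The intuition matching the bullets above: because the fine-tuning target is sampled from the model's own distribution rather than copied verbatim from the expert, each gradient update perturbs the model's prior behavior less than standard SFT, which is the claimed source of the reduced catastrophic forgetting.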