Self-Distillation Enables Continual Learning [pdf]
- #self-distillation
- #foundation-models
- #continual-learning
- Self-Distillation Fine-Tuning (SDFT) enables continual learning by converting expert demonstrations into on-policy training data.
- SDFT conditions the model on a demonstration and uses that demonstration-conditioned model as its own teacher, generating on-policy training signals (a minimal sketch follows this list).
- The method outperforms supervised fine-tuning (SFT) on both fronts: higher accuracy on the new task and less catastrophic forgetting of prior capabilities.
- In sequential learning, SDFT allows a single model to accumulate multiple skills over time without performance regression.
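A minimal sketch of what one SDFT update might look like, assuming a KL-distillation objective and a simple demonstration-in-context prompt template. The prompt format, sampling settings, and loss below are illustrative assumptions, not the paper's exact recipe; `response_logprobs` and `sdft_step` are hypothetical helper names.

```python
# Hedged sketch of one SDFT update step: the demonstration-conditioned model
# acts as teacher for the demonstration-free student (same weights).
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # small stand-in; the paper fine-tunes larger models
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)


def response_logprobs(prefix_ids, response_ids):
    """Log-probs the model assigns to each response token after the prefix."""
    ids = torch.cat([prefix_ids, response_ids], dim=-1)
    logits = model(ids).logits
    start = prefix_ids.shape[-1] - 1  # logits at position i predict token i+1
    pred = logits[:, start : start + response_ids.shape[-1], :]
    return F.log_softmax(pred, dim=-1)


def sdft_step(prompt, demonstration):
    # Teacher context: the current model conditioned on the expert demo.
    # (This prompt template is an assumption for illustration.)
    teacher_prefix = tok(
        f"{prompt}\nReference answer: {demonstration}\nYour answer:",
        return_tensors="pt",
    ).input_ids
    # Student context: the same prompt without the demonstration.
    student_prefix = tok(f"{prompt}\nYour answer:", return_tensors="pt").input_ids

    # On-policy sample: the demo-conditioned model restates the target
    # in its own token distribution.
    with torch.no_grad():
        out = model.generate(
            teacher_prefix,
            max_new_tokens=64,
            do_sample=True,
            pad_token_id=tok.eos_token_id,
        )
    response_ids = out[:, teacher_prefix.shape[-1]:]

    # Distill: pull the student's (demo-free) distribution toward the
    # teacher's on the sampled response. The teacher pass is a frozen
    # per-step snapshot of the same weights, i.e. the model is its own teacher.
    with torch.no_grad():
        teacher_lp = response_logprobs(teacher_prefix, response_ids)
    student_lp = response_logprobs(student_prefix, response_ids)
    loss = F.kl_div(student_lp, teacher_lp, log_target=True,
                    reduction="batchmean")

    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


# Illustrative usage:
# sdft_step("Translate 'bonjour' to English.", "hello")
```

Because the training targets are sampled from the model itself (merely guided by the demonstration), the update stays close to the model's own distribution, which is the intuition behind the reduced catastrophic forgetting relative to SFT.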