Self-Adapting Language Models
- #Machine Learning
- #Natural Language Processing
- #Artificial Intelligence
- Introduces Self-Adapting LLMs (SEAL), a framework enabling large language models (LLMs) to self-adapt by generating their own finetuning data and update directives.
- SEAL allows models to produce self-edits that can restructure information, specify optimization hyperparameters, or invoke tools for data augmentation and gradient-based updates.
- Applies self-edits through supervised finetuning (SFT), producing persistent weight updates and lasting adaptation (see the first sketch after this list).
- Trains self-edit generation with a reinforcement learning loop, using the downstream performance of the updated model as the reward signal (see the second sketch after this list).
- Unlike prior approaches that rely on separate adaptation modules or auxiliary networks, SEAL directly uses the model's own generation to control its adaptation process.
- Experiments show SEAL's effectiveness in knowledge incorporation and few-shot generalization, marking a step toward self-directed adaptation in language models.
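
To make the inner loop concrete, here is a minimal sketch assuming a HuggingFace causal LM: the model generates a self-edit (here, a restatement of new information as training statements), which is then baked into the weights with a few SFT gradient steps. The prompt template, model choice, generation settings, and optimizer schedule are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of SEAL's inner loop: generate a self-edit, then persist it
# via SFT. Prompt, hyperparameters, and model are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper uses larger instruction-tuned LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def generate_self_edit(context: str) -> str:
    """Ask the model to restate the context as finetuning data (a 'self-edit')."""
    prompt = f"Rewrite the passage as standalone training statements:\n{context}\n"
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=128, do_sample=True)
    new_tokens = output_ids[0, inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)

def apply_self_edit(self_edit: str, steps: int = 3) -> None:
    """Persist the self-edit into the weights with a few SFT gradient steps."""
    batch = tokenizer(self_edit, return_tensors="pt")
    for _ in range(steps):
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

context = "The Eiffel Tower was completed in 1889 for the World's Fair."
edit = generate_self_edit(context)
apply_self_edit(edit)  # the model now carries the edit in its weights
```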
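The outer loop then rewards self-edits by how much they improve the updated model. The paper reports optimizing this with a ReST^EM-style approach (rejection sampling plus SFT on successful self-edits) rather than policy-gradient methods, and the skeleton below follows that shape. The helpers `generate_self_edit`, `apply_self_edit`, `evaluate`, and `sft_on` are hypothetical stand-ins; the first two follow the previous sketch but thread the model through explicitly.

```python
# Skeleton of SEAL's outer RL loop in the rejection-sampling (ReST^EM) style:
# keep only self-edits whose resulting weight update improves downstream
# performance, then train the generator on those. Helpers are hypothetical.
import copy

def seal_outer_loop(model, tasks, num_candidates=4, num_rounds=2):
    for _ in range(num_rounds):
        accepted = []
        for context, eval_set in tasks:
            baseline = evaluate(model, eval_set)   # pre-update performance
            for _ in range(num_candidates):
                edit = generate_self_edit(model, context)
                candidate = copy.deepcopy(model)   # throwaway copy per edit
                apply_self_edit(candidate, edit)   # inner SFT update
                reward = evaluate(candidate, eval_set) - baseline
                if reward > 0:                     # keep only helpful edits
                    accepted.append((context, edit))
        # Train the generator on its own successful self-edits, so future
        # generations produce edits that lead to better updates.
        sft_on(model, accepted)
    return model
```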