Is One Layer Enough? A Single Transformer Layer Matches Full-Parameter RL Training
9 hours ago
- #transformer-architecture
- #reinforcement-learning
- #model-efficiency
- RL gains during post-training of large language models are highly concentrated in a small subset of transformer layers.
- Training just one transformer layer can recover most, and sometimes even surpass, the performance gains of full-parameter RL training.
- High-contribution layers are consistently found in the middle of the transformer stack, while layers near the input and output contribute less.
- This pattern is stable across multiple models, RL algorithms, and task domains like mathematical reasoning and code generation.
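
To make the single-layer idea concrete, here is a minimal sketch of how one might choose which parameters to leave trainable while freezing the rest. It is not the authors' code. The `layers.<index>.` naming pattern, the helper names (`layer_index`, `select_trainable`, `middle_layer`), and the toy parameter list are all assumptions made for illustration. Picking the exact middle of the stack is only a starting guess based on the observation above that high-contribution layers sit mid-stack.

```python
import re

# Assumption: parameter names follow a "...layers.<index>...." convention,
# as in many open-source decoder-only checkpoints. Adjust the pattern otherwise.
LAYER_PATTERN = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def layer_index(param_name):
    """Return the transformer-layer index encoded in a parameter name, or None."""
    match = LAYER_PATTERN.search(param_name)
    return int(match.group(1)) if match else None


def select_trainable(param_names, target_layer):
    """Return the names that belong to target_layer; all others stay frozen."""
    return [name for name in param_names if layer_index(name) == target_layer]


def middle_layer(num_layers):
    """Pick the middle of the stack as a starting guess (a heuristic, not a rule)."""
    return num_layers // 2


# Hypothetical parameter names for a 4-layer toy model.
names = ["embed_tokens.weight"]
for i in range(4):
    names += [f"layers.{i}.attn.weight", f"layers.{i}.mlp.weight"]
names.append("lm_head.weight")

target = middle_layer(4)
trainable = select_trainable(names, target)
print(target, trainable)
```

With a PyTorch model, the same selection could be applied by looping over `model.named_parameters()` and setting `param.requires_grad` to `True` only for names in `trainable`, so that the optimizer updates a single layer during RL post-training.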