From Memorization to Reasoning in the Spectrum of Loss Curvature
- #loss curvature
- #transformer models
- #memorization
- Memorization in transformer models can be characterized and disentangled using the curvature of the loss landscape (see the curvature sketch after this list).
- Weight editing based on curvature suppresses recall of memorized data more effectively than the BalancedSubnet baseline while maintaining lower perplexity (see the projection sketch after this list).
- Closed-book fact retrieval and arithmetic degrade after editing, suggesting these tasks rely on specialized weight directions.
- Open-book fact retrieval and general logical reasoning are preserved after editing.
- The study clarifies how memorized data is stored and can be removed, highlighting task-specific structure in the weights.
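
To make the curvature signal in the first bullet concrete, here is a minimal PyTorch sketch that estimates per-example loss curvature as the trace of the parameter Hessian via Hutchinson's estimator. This is a generic curvature proxy, not necessarily the estimator used in the paper; `hessian_trace` and `n_samples` are illustrative names, and the Rademacher probes are one common choice.

```python
import torch

def hessian_trace(model, loss_fn, inputs, targets, n_samples=8):
    """Hutchinson estimate of tr(H), where H is the Hessian of one
    example's loss w.r.t. the model parameters: tr(H) ~ E[v^T H v]
    for random probe vectors v with E[v v^T] = I."""
    params = [p for p in model.parameters() if p.requires_grad]
    loss = loss_fn(model(inputs), targets)
    # create_graph=True keeps the graph so we can differentiate again
    grads = torch.autograd.grad(loss, params, create_graph=True)
    estimate = 0.0
    for _ in range(n_samples):
        # Rademacher probe vectors with entries in {-1, +1}
        vs = [torch.randint_like(p, 2) * 2 - 1 for p in params]
        # Hessian-vector product: d(grad . v)/d(params) = H v
        gv = sum((g * v).sum() for g, v in zip(grads, vs))
        hvs = torch.autograd.grad(gv, params, retain_graph=True)
        # v^T H v is an unbiased sample of tr(H)
        estimate += sum((h * v).sum() for h, v in zip(hvs, vs)).item()
    return estimate / n_samples
```

Comparing this estimate across examples gives the kind of per-example curvature profile the summary refers to when it says memorization can be "characterized and disentangled" by curvature.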
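
The curvature-based weight editing in the second bullet can be pictured as projecting a layer's weight matrix away from directions implicated in memorization. A hedged sketch follows, assuming the directions `u` have already been identified (for instance from a curvature eigendecomposition); the paper's actual editing rule and direction-selection criterion may differ.

```python
import torch

def project_out(weight: torch.Tensor, u: torch.Tensor) -> torch.Tensor:
    """Edit a weight matrix by removing its response to a subspace.

    weight: (out_features, in_features) matrix of one linear layer.
    u:      (in_features, k) orthonormal columns spanning the
            directions to suppress (assumed given; hypothetical here).

    Returns W (I - U U^T), which zeroes the layer's output for any
    input component lying in the suppressed subspace.
    """
    eye = torch.eye(weight.shape[1], device=weight.device)
    return weight @ (eye - u @ u.T)

# Example application to one layer, in place:
# with torch.no_grad():
#     layer.weight.copy_(project_out(layer.weight, u))
```

Under this view, the reported results say that memorized sequences and closed-book recall live largely inside such narrow subspaces, while open-book retrieval and general reasoning use directions the projection leaves intact.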