Utility-Preserving, Robust, and Almost Irreversible Forgetting in LLMs

  • #Data Privacy
  • #Machine Learning
  • #Large Language Models
  • Introduction of JensUn, a new unlearning method for LLMs that uses the Jensen-Shannon Divergence as its forget loss, yielding more stable training and more effective forgetting (see the loss sketch after this list).
  • JensUn achieves a better forget-utility trade-off than existing methods and resists benign relearning, in which further fine-tuning on benign data can resurface supposedly forgotten knowledge.
  • Creation of LKF, a dataset of lesser-known facts, to provide a realistic scenario for evaluating unlearning methods.
  • Proposal of an improved evaluation framework that uses an LLM as a semantic judge and takes the worst case over many paraphrases and input formats, so a fact counts as forgotten only if no variant elicits it (sketched below).
  • Findings that, under this stricter evaluation, many existing unlearning methods are less effective than previously reported.
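
The brief does not spell out JensUn's exact objective, but the core idea of a Jensen-Shannon forget loss can be sketched. Unlike the KL divergence or plain gradient ascent, the JSD is symmetric and bounded by log 2, which is the usual argument for it as a better-behaved training signal. Below is a minimal PyTorch sketch, assuming the forget target is a fixed token distribution (e.g. a one-hot encoding of an "I don't know"-style refusal); the function name, tensor shapes, and choice of target are illustrative assumptions, not the paper's code:

```python
import torch
import torch.nn.functional as F

def js_forget_loss(model_logits: torch.Tensor,
                   target_probs: torch.Tensor) -> torch.Tensor:
    """Jensen-Shannon divergence between the model's next-token
    distribution P and a fixed forget target Q:

        JSD(P, Q) = 0.5 * KL(P || M) + 0.5 * KL(Q || M),  M = (P + Q) / 2

    Bounded by log(2), unlike KL- or gradient-ascent-style losses.

    model_logits: (batch, seq_len, vocab) raw logits on forget inputs.
    target_probs: (batch, seq_len, vocab) target distribution, e.g. the
        one-hot tokens of a refusal answer (an assumption here; the
        summary does not specify the target).
    """
    p = F.softmax(model_logits, dim=-1)
    m = 0.5 * (p + target_probs)
    # F.kl_div(log_input, target) computes KL(target || exp(log_input));
    # "batchmean" sums over positions/vocab and averages over the batch.
    kl_p_m = F.kl_div(m.log(), p, reduction="batchmean")
    kl_q_m = F.kl_div(m.log(), target_probs, reduction="batchmean")
    return 0.5 * (kl_p_m + kl_q_m)
```

Because the JSD saturates at log 2, pushing the model away from a forgotten answer cannot produce the unbounded gradients that tend to destabilize general utility in gradient-ascent-style unlearning.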
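
The worst-case evaluation protocol can likewise be captured in a few lines: a fact counts as forgotten only if none of its paraphrases or input formats elicits it, with semantic matching delegated to an LLM judge. The `generate` and `judge` callables below are hypothetical interfaces standing in for the paper's actual harness:

```python
from typing import Callable, Iterable

def is_forgotten(generate: Callable[[str], str],
                 judge: Callable[[str, str], bool],
                 paraphrases: Iterable[str],
                 reference_answer: str) -> bool:
    """Worst-case unlearning check for a single fact.

    generate: maps a prompt to the model's answer (hypothetical interface).
    judge: LLM-as-judge callable returning True when the model's answer
        semantically matches the reference answer (hypothetical interface).
    paraphrases: rephrasings and alternative input formats of the question.
    """
    for prompt in paraphrases:
        answer = generate(prompt)
        if judge(answer, reference_answer):
            return False  # any single leak counts as a failure
    return True
```

Taking this minimum over prompts, rather than averaging across them, is what exposes methods that only appear effective on a single canonical phrasing, which is the brief's final point.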