Instruction-Following Pruning for Large Language Models

11 hours ago


Proposes a dynamic approach to structured pruning for large language models (LLMs) called 'instruction-following pruning'.
Introduces a sparse mask predictor that dynamically selects relevant model parameters based on user instructions.
Jointly optimizes the sparse mask predictor and the LLM using instruction-following data and a pre-training corpus.
Demonstrates effectiveness: a model with 3B activated parameters outperforms a 3B dense model by 5-8 points in math and coding domains.
Shows performance rivaling a 9B model, indicating better efficiency and accuracy than traditional static pruning.
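
As a rough illustration of the mask-predictor idea summarized above, the sketch below scores the hidden channels of a single feed-forward layer from an instruction embedding and computes only the highest-scoring ones. Everything in it is an assumption made for illustration and not the paper's implementation: the linear predictor, the top-k selection rule, the choice to mask feed-forward channels, and all names (`predict_mask`, `pruned_ffn`, `predictor_weights`, the toy sizes). Weights are random and the joint training of predictor and model is not shown.

```python
import random

def rand_matrix(rng, rows, cols):
    """Hypothetical helper: random weights standing in for trained parameters."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def predict_mask(instruction_emb, predictor_weights, keep):
    """Score each hidden channel from the instruction and keep the top `keep`.

    Assumption: a single linear layer acts as the sparse mask predictor.
    """
    scores = [sum(w * e for w, e in zip(row, instruction_emb))
              for row in predictor_weights]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:keep])
    return [1 if i in kept else 0 for i in range(len(scores))]

def pruned_ffn(x, w_in, w_out, mask):
    """ReLU feed-forward layer that evaluates only the channels the mask selects.

    w_in has shape (d_hidden, d_model); w_out has shape (d_model, d_hidden).
    """
    active = [i for i, m in enumerate(mask) if m]
    hidden = [max(0.0, sum(w * v for w, v in zip(w_in[i], x))) for i in active]
    # Only the columns of w_out that belong to active channels contribute.
    return [sum(row[i] * h for i, h in zip(active, hidden)) for row in w_out]

# Toy sizes, chosen only to keep the example readable.
rng = random.Random(0)
d_instr, d_model, d_hidden, keep = 5, 4, 8, 3

predictor_weights = rand_matrix(rng, d_hidden, d_instr)
w_in = rand_matrix(rng, d_hidden, d_model)
w_out = rand_matrix(rng, d_model, d_hidden)

instruction_emb = [rng.uniform(-1.0, 1.0) for _ in range(d_instr)]
x = [rng.uniform(-1.0, 1.0) for _ in range(d_model)]

# The mask is computed once per instruction and could be reused for every token.
mask = predict_mask(instruction_emb, predictor_weights, keep)
y = pruned_ffn(x, w_in, w_out, mask)
print("mask:", mask)
print("output:", y)
```

In this sketch the mask depends only on the instruction, so the same subset of channels serves the whole response; that is what distinguishes a dynamic, input-dependent mask from a static one fixed for all inputs.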
