SAU: Sparsity-Aware Unlearning for LLMs via Gradient Masking and Importance Redistribution
This work addresses privacy risks in sparse LLMs, which are widely used for efficient deployment, by enabling selective removal of sensitive information. The contribution is incremental, since it adapts existing unlearning to sparsified models.
The paper tackles machine unlearning on sparse large language models, where existing methods degrade because they assume every parameter can be updated, whereas sparsification has already pruned many weights to zero. The proposed Sparsity-Aware Unlearning (SAU) method combines gradient masking, which restricts updates to surviving weights, with importance redistribution, which compensates for the pruned ones. The authors report that it significantly outperforms existing methods on sparse LLMs, achieving effective forgetting while preserving model utility.
Large Language Models (LLMs) inevitably memorize sensitive information during training, posing significant privacy risks. Machine unlearning has emerged as a promising solution to selectively remove such information without full retraining. However, existing methods are designed for dense models and overlook model sparsification, an essential technique for efficient LLM deployment. We find that unlearning effectiveness degrades substantially on sparse models. Through empirical analysis, we reveal that this degradation occurs because existing unlearning methods require updating all parameters, yet sparsification prunes a substantial fraction of weights to zero, fundamentally limiting the model's forgetting capacity. To address this challenge, we propose Sparsity-Aware Unlearning (SAU), which decouples the unlearning objective from the sparsification objective through gradient masking that redirects updates to surviving weights, combined with importance-aware redistribution to compensate for pruned parameters. Extensive experiments demonstrate that SAU significantly outperforms existing methods on sparse LLMs, achieving effective forgetting while preserving model utility.
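The two mechanisms named in the abstract, gradient masking and importance-aware redistribution, can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so everything concrete here is an assumption made for illustration: the function name `masked_unlearning_step`, the importance score `|w| * |g|`, the rule that gradient magnitude falling on pruned positions is handed to surviving weights in proportion to that score, and the use of gradient ascent on a forget loss. It is not the authors' implementation.

```python
import numpy as np


def masked_unlearning_step(weights, grad, lr=0.1):
    """One illustrative unlearning step on a pruned weight matrix.

    Hypothetical sketch: the importance score and the redistribution rule
    are assumptions, not the formulas used by SAU.
    """
    # Surviving (unpruned) positions are the non-zero weights.
    mask = weights != 0

    # Gradient masking: drop the gradient on pruned positions so they stay zero.
    masked_grad = np.where(mask, grad, 0.0)

    # Gradient magnitude that landed on pruned positions and is otherwise lost.
    lost = np.abs(grad[~mask]).sum()

    # Assumed importance score for surviving weights: |w| * |g|.
    importance = np.abs(weights) * np.abs(masked_grad)
    total = importance.sum()
    if total > 0:
        share = importance / total
        # Redistribute the lost magnitude along each surviving gradient's sign.
        masked_grad = masked_grad + np.sign(masked_grad) * share * lost

    # Assumed objective: gradient ascent on the forget loss.
    new_weights = weights + lr * masked_grad
    return new_weights, masked_grad, mask


# Toy 2x2 layer with 50% sparsity.
W = np.array([[0.5, 0.0], [0.0, -1.0]])
G = np.array([[1.0, 2.0], [3.0, -1.0]])
W_new, G_eff, M = masked_unlearning_step(W, G)
```

In this toy setup the pruned entries of `W_new` remain exactly zero, so the sparsity pattern is preserved, while the total magnitude of the effective gradient matches that of the dense gradient. Whether SAU preserves total gradient magnitude in this way is not stated in the abstract.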