LG · CL · Feb 13

Quantization-Robust LLM Unlearning via Low-Rank Adaptation

arXiv:2602.13151v1 · 1 citation · h-index: 2
AI Analysis

This paper addresses a practical issue for deploying unlearned models efficiently, but the contribution is incremental: it adapts an existing method (LoRA) to a specific bottleneck.

The paper tackles the problem of large language model unlearning being degraded by post-training quantization. It shows that low-rank adaptation (LoRA) improves 4-bit utility by up to 7.93 points and reduces privacy leakage (e.g., PrivLeak for GA+KLR on BOOKS moves from -25.68 to -5.86) while maintaining strong forgetting.

Large language model (LLM) unlearning aims to remove targeted knowledge from a trained model, but practical deployments often require post-training quantization (PTQ) for efficient inference. Aggressive low-bit PTQ can mask or erase unlearning updates, causing quantized models to revert to pre-unlearning behavior. We show that standard full-parameter fine-tuning often induces parameter changes that are too small to survive 4-bit quantization. We propose quantization-robust unlearning via low-rank adaptation (LoRA): we freeze the base model and concentrate unlearning into trainable adapters so that the effective update is preserved after quantization. On Llama-2-7B evaluated on the MUSE dataset (BOOKS and NEWS), LoRA improves 4-bit utility by up to 7.93 points (NPO+GDR on BOOKS: 50.17 to 58.10) and yields higher 4-bit utility on NEWS for GA+GDR (40.06 to 44.82, an increase of 4.76). LoRA also substantially reduces privacy leakage under 4-bit PTQ: for GA+KLR on BOOKS, for example, PrivLeak moves from -25.68 to -5.86 (closer to the ideal of 0), while maintaining strong forgetting (VerMem and KnowMem near 0). LoRA is therefore beneficial for machine unlearning in scenarios where quantization is necessary for model deployment.
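The masking effect described in the abstract can be illustrated with a toy sketch. This is not the paper's method or experimental setup: the quantizer (symmetric round-to-nearest with a single absmax scale), the Gaussian weight distribution, and both update magnitudes are assumptions chosen only to show that an update much smaller than the quantization step rarely changes any 4-bit code, while an update comparable to the step usually does.

```python
import random


def absmax_scale(weights, bits=4):
    """Step size of a symmetric quantizer whose largest code maps to max |w|."""
    qmax = 2 ** (bits - 1) - 1  # 7 for signed 4-bit codes
    return max(abs(w) for w in weights) / qmax


def quantize(weights, scale, bits=4):
    """Round each weight to the nearest integer code, clamped to the 4-bit range."""
    qmin, qmax = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return [max(qmin, min(qmax, round(w / scale))) for w in weights]


def fraction_of_codes_changed(base, delta, scale):
    """Share of weights whose quantized code differs after adding the update."""
    before = quantize(base, scale)
    after = quantize([w + d for w, d in zip(base, delta)], scale)
    return sum(b != a for b, a in zip(before, after)) / len(base)


rng = random.Random(0)
n = 10_000

# Hypothetical base weights; the standard deviation is an illustrative choice.
base = [rng.gauss(0.0, 0.02) for _ in range(n)]

# Assumption: reuse the base model's scale for both cases to isolate rounding.
scale = absmax_scale(base)

# Hypothetical updates: one far below the quantization step, one comparable to it.
small_update = [rng.gauss(0.0, 1e-4) for _ in range(n)]
large_update = [rng.gauss(0.0, 1e-2) for _ in range(n)]

small_frac = fraction_of_codes_changed(base, small_update, scale)
large_frac = fraction_of_codes_changed(base, large_update, scale)

print(f"quantization step:            {scale:.5f}")
print(f"codes changed (small update): {small_frac:.2%}")
print(f"codes changed (large update): {large_frac:.2%}")
```

In this toy setting, the quantized model after the small update is almost identical to the quantized base model, which mirrors the reversion to pre-unlearning behavior that the abstract attributes to aggressive low-bit PTQ. How closely the two update sizes correspond to full fine-tuning and LoRA in practice is not established by this sketch.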

