LG · CL · Oct 29, 2024

Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate

arXiv:2410.22086v3 · 30 citations · h-index: 61 · NAACL
Originality: Incremental advance
AI Analysis

This work addresses the need to remove unwanted knowledge from LLMs and represents an incremental improvement in the domain of machine unlearning.

The paper tackles machine unlearning in large language models by framing it as a multi-task optimization problem with two objectives: a forgetting objective and a model performance objective. It introduces the NGDiff algorithm with an adaptive learning rate and reports superior performance and stable training on the TOFU and MUSE datasets.

Machine unlearning has been used to remove unwanted knowledge acquired by large language models (LLMs). In this paper, we examine machine unlearning from an optimization perspective, framing it as a regularized multi-task optimization problem, where one task optimizes a forgetting objective and another optimizes the model performance. In particular, we introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives, while integrating a new, automatic learning rate scheduler. We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets while exhibiting stable training.
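The abstract does not spell out the update rule, so the following is only a minimal sketch of what a normalized gradient difference step might look like. It assumes that each step descends along the unit-normalized gradient of the retain (model performance) loss and ascends along the unit-normalized gradient of the forget loss. The function names (`l2_norm`, `normalize`, `ngdiff_step`), the `eps` constant, and the fixed learning rate `lr` are illustrative assumptions; the paper's automatic learning rate scheduler is not reproduced here.

```python
import math


def l2_norm(vec):
    """Euclidean norm of a flat list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def normalize(vec, eps=1e-12):
    """Scale a gradient to (approximately) unit norm; eps avoids division by zero."""
    n = l2_norm(vec)
    return [x / (n + eps) for x in vec]


def ngdiff_step(params, grad_retain, grad_forget, lr):
    """One hypothetical normalized-gradient-difference update.

    Assumption: move against the normalized retain gradient (lower the
    retain loss) and along the normalized forget gradient (raise the
    forget loss). `lr` is a fixed step size standing in for the paper's
    adaptive scheduler.
    """
    g_r = normalize(grad_retain)
    g_f = normalize(grad_forget)
    return [p - lr * (r - f) for p, r, f in zip(params, g_r, g_f)]


# Toy usage: the two gradients differ widely in scale, but after
# normalization each contributes a unit-length direction to the step.
theta = [0.0, 0.0]
new_theta = ngdiff_step(theta, grad_retain=[3.0, 4.0], grad_forget=[0.0, 10.0], lr=1.0)
```

Under these assumptions, normalizing both gradients keeps either objective from dominating the step merely because its gradient is larger in magnitude, which is one way to read the abstract's claim of better control over the trade-off between the objectives.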

