CL · AI · Apr 15

From Anchors to Supervision: Memory-Graph Guided Corpus-Free Unlearning for Large Language Models

arXiv:2604.13777 · 33.9 · h-index: 15
Predicted impact top 12% in CL · last 90 days
Originality: Highly original
AI Analysis

For LLM developers and regulators, MAGE provides a practical, auditable unlearning workflow that eliminates the need for user-supplied forget sets, reducing privacy risks and abuse potential.

MAGE enables corpus-free unlearning in LLMs: from only a lightweight user anchor, it recovers target-related memorization, builds a memory graph, and synthesizes supervision. On the TOFU and RWKU benchmarks, this self-generated supervision achieves unlearning performance comparable to supervision generated with external reference.

Large language models (LLMs) may memorize sensitive or copyrighted content, raising significant privacy and legal concerns. While machine unlearning has emerged as a potential remedy, prevailing paradigms rely on user-provided forget sets, making unlearning requests difficult to audit and exposing systems to secondary leakage and malicious abuse. We propose MAGE, a Memory-grAph Guided Erasure framework for user-minimized, corpus-free unlearning. Given only a lightweight user anchor that identifies a target entity, MAGE probes the target LLM to recover target-related memorization, organizes it into a weighted local memory graph, and synthesizes scoped supervision for unlearning. MAGE is model-agnostic, can be plugged into standard unlearning methods, and requires no access to the original training corpus. Experiments on two benchmarks, TOFU and RWKU, demonstrate that MAGE's self-generated supervision achieves effective unlearning performance comparable to supervision generated with external reference, while preserving overall utility. These results support a practical and auditable unlearning workflow driven by minimal anchors rather than user-supplied forget corpora.
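
The anchor-to-supervision flow described in the abstract can be illustrated with a minimal toy sketch. This is not the authors' implementation: the stub `toy_model`, the helpers `build_memory_graph` and `synthesize_forget_set`, the recall-frequency edge weighting, and the question template are all assumptions made for illustration, and the entity and facts are fictional.

```python
from collections import Counter, defaultdict

def toy_model(entity):
    """Hypothetical stand-in for the target LLM: returns the
    (relation, value) pairs it 'recalls' about an entity."""
    memory = {
        "Alice Example": [("occupation", "novelist"),
                          ("notable work", "The Glass Orchard")],
        "The Glass Orchard": [("publication year", "1998")],
    }
    return memory.get(entity, [])

def build_memory_graph(anchor, probe, depth=2, samples=3):
    """Probe outward from the anchor, breadth first.
    Assumption: an edge's weight is how often the fact was recalled."""
    graph = defaultdict(Counter)  # node -> Counter{(relation, value): weight}
    frontier, seen = [anchor], {anchor}
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for _ in range(samples):
                for relation, value in probe(node):
                    graph[node][(relation, value)] += 1
                    if value not in seen:
                        seen.add(value)
                        next_frontier.append(value)
        frontier = next_frontier
    return graph

def synthesize_forget_set(graph, min_weight=2):
    """Turn sufficiently weighted edges into question/answer pairs
    that a standard unlearning method could consume as a forget set."""
    pairs = []
    for node, edges in graph.items():
        for (relation, value), weight in edges.items():
            if weight >= min_weight:
                pairs.append({
                    "question": f"What is the {relation} of {node}?",
                    "answer": value,
                    "weight": weight,
                })
    return pairs

graph = build_memory_graph("Alice Example", toy_model)
forget_set = synthesize_forget_set(graph)
for pair in forget_set:
    print(pair)
```

Because the stub is deterministic, every edge here ends up with weight equal to `samples`; with a real, sampled model the weights would vary, and the `min_weight` threshold is one simple way the scope of the forget set might be limited.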

