LG | Jul 13, 2025

Leveraging Distribution Matching to Make Approximate Machine Unlearning Faster

arXiv:2507.09786v3 | h-index: 1 | Has Code

Originality: Incremental advance

AI Analysis

This work targets the slow runtime of approximate machine unlearning, offering an incremental efficiency improvement through dataset condensation and a modified unlearning loss.

The paper tackles the computational inefficiency of approximate machine unlearning with two methods. Blend is a dataset condensation technique that shrinks the retained set, and A-AMU is a loss-level change that speeds up convergence. Together they are reported to substantially reduce unlearning latency while preserving model utility and privacy.

Approximate machine unlearning (AMU) enables models to "forget" specific training data through specialized fine-tuning on the retained (and forget) subsets of the training set. However, processing the large retained subset still dominates computational runtime, and reducing the number of unlearning epochs remains a challenge. In this paper, we propose two complementary methods to accelerate any classification-oriented AMU method. First, Blend, a novel distribution-matching dataset condensation (DC) method, merges visually similar images with shared blend-weights to significantly reduce the retained set size. It operates with minimal pre-processing overhead and is orders of magnitude faster than state-of-the-art DC methods. Second, our loss-centric method, Accelerated-AMU (A-AMU), augments the AMU objective to quicken convergence. A-AMU achieves this by combining a steepened primary loss, which expedites forgetting, with a differentiable regularizer that matches the loss distributions of forgotten and in-distribution unseen data. Our extensive experiments demonstrate that this dual approach of data-centric and loss-centric optimization dramatically reduces end-to-end unlearning latency across both single-round and multi-round scenarios, all while preserving model utility and privacy. To our knowledge, this is the first work to systematically tackle unlearning efficiency by jointly designing a specialized dataset condensation technique with a dedicated accelerated loss function. Code is available at https://github.com/algebraicdianuj/DC_Unlearning.
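The two ideas in the abstract can be illustrated with a rough, dependency-free Python sketch. It is not the authors' implementation, and every name in it (`blend_group`, `loss_distribution_gap`, `accelerated_objective`, `steepness`, `reg_weight`) is hypothetical. It assumes that blending is a convex combination of flattened images, that "steepening" is a simple scale factor on the primary loss, and that the distribution-matching regularizer can be stood in for by a 1-D Wasserstein-1 distance between equal-size loss samples. The abstract does not specify any of these choices.

```python
from typing import List, Sequence


def blend_group(images: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Merge a group of similar images into one synthetic image.

    Assumption: blending is modeled as a convex combination of flattened
    pixel vectors; the paper's actual Blend procedure may differ.
    """
    if not images or len(images) != len(weights):
        raise ValueError("need at least one image and one weight per image")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]  # normalize so the weights sum to 1
    n_pixels = len(images[0])
    return [sum(w * img[p] for w, img in zip(norm, images))
            for p in range(n_pixels)]


def loss_distribution_gap(forget_losses: Sequence[float],
                          unseen_losses: Sequence[float]) -> float:
    """1-D Wasserstein-1 distance between two equal-size loss samples.

    Assumption: used here only as a stand-in for the paper's regularizer.
    For equal-size samples it is the mean absolute difference of the
    sorted values.
    """
    if not forget_losses or len(forget_losses) != len(unseen_losses):
        raise ValueError("need two non-empty samples of equal size")
    a, b = sorted(forget_losses), sorted(unseen_losses)
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def accelerated_objective(primary_loss: float,
                          forget_losses: Sequence[float],
                          unseen_losses: Sequence[float],
                          steepness: float = 2.0,
                          reg_weight: float = 1.0) -> float:
    """Hypothetical combined objective: scaled primary loss plus regularizer."""
    gap = loss_distribution_gap(forget_losses, unseen_losses)
    return steepness * primary_loss + reg_weight * gap
```

In this sketch the regularizer reaches zero when the per-sample losses on the forget set are distributed like those on unseen data, which matches the intuition that forgotten samples should look like data the model never trained on. A real implementation would compute these quantities on framework tensors so that gradients can flow through them.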
