LG, AI · Aug 25, 2025

Data Augmentation Improves Machine Unlearning

Andreza M. C. Falcao, Filipe R. Cordeiro

arXiv:2508.18502v1 · 1 citation · h-index: 1 · SIBGRAPI

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of efficiently removing specific data influences from trained models for privacy and performance, though it appears incremental by applying existing augmentation methods to an under-investigated area.

The paper tackles the problem of improving machine unlearning by exploring data augmentation strategies, showing that proper augmentation design can reduce the performance gap to retrained models by up to 40.12% in terms of the Average Gap unlearning Metric on datasets like CIFAR-10 and CIFAR-100.

Machine Unlearning (MU) aims to remove the influence of specific data from a trained model while preserving its performance on the remaining data. Although a few works suggest connections between memorization and augmentation, the role of systematic augmentation design in MU remains under-investigated. In this work, we investigate the impact of different data augmentation strategies on the performance of unlearning methods, including SalUn, Random Label, and Fine-Tuning. Experiments conducted on CIFAR-10 and CIFAR-100, under varying forget rates, show that proper augmentation design can significantly improve unlearning effectiveness, reducing the performance gap to retrained models. Results show a reduction of up to 40.12% in the Average Gap unlearning metric when using TrivialAug augmentation. Our results suggest that augmentation not only helps reduce memorization but also plays a crucial role in achieving privacy-preserving and efficient unlearning.
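The abstract does not define the Average Gap metric. As a minimal sketch, assume the convention common in the unlearning literature: the mean absolute difference between an unlearned model and a retrained-from-scratch model over a set of evaluation metrics. The metric keys (`UA`, `RA`, `TA`, `MIA`), the function names, and every number below are illustrative assumptions, not values or definitions taken from the paper.

```python
from statistics import mean


def average_gap(unlearned: dict, retrained: dict) -> float:
    """Mean absolute difference over the metrics both models report.

    Assumed definition (not confirmed by the abstract); values are
    expected in percentage points.
    """
    shared = sorted(unlearned.keys() & retrained.keys())
    if not shared:
        raise ValueError("no shared metrics to compare")
    return mean(abs(unlearned[k] - retrained[k]) for k in shared)


def relative_reduction(baseline_gap: float, improved_gap: float) -> float:
    """Percentage by which the gap shrinks relative to the baseline gap."""
    return 100.0 * (baseline_gap - improved_gap) / baseline_gap


# Made-up scores for illustration only.
retrained = {"UA": 5.0, "RA": 100.0, "TA": 94.0, "MIA": 13.0}
without_aug = {"UA": 3.0, "RA": 99.0, "TA": 93.0, "MIA": 9.0}
with_aug = {"UA": 4.0, "RA": 99.5, "TA": 93.5, "MIA": 11.0}

gap_without = average_gap(without_aug, retrained)
gap_with = average_gap(with_aug, retrained)
reduction = relative_reduction(gap_without, gap_with)
```

Under this reading, a figure such as 40.12% would be a relative reduction of the gap, not a change in absolute percentage points.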

