LGAICRMay 13, 2021

DeepObliviate: A Powerful Charm for Erasing Data Residual Memory in Deep Neural Networks

arXiv:2105.06209v134 citations
Originality Incremental advance
AI Analysis

This addresses the need for machine unlearning to protect user privacy and meet legal requirements, offering a more efficient solution than retraining from scratch, though it is incremental in improving existing unlearning approaches.

The paper tackles the problem of efficiently erasing data from deep neural networks to comply with privacy regulations, proposing DeepObliviate, which achieves high accuracy rates (e.g., 99.0% on MNIST) and significant speedups (e.g., 66.7x on MNIST) compared to retraining from scratch, while improving over state-of-the-art methods by 5.8% accuracy and 32.5x prediction speedup on average.

Machine unlearning has great significance in guaranteeing model security and protecting user privacy. Additionally, many legal provisions clearly stipulate that users have the right to demand model providers to delete their own data from training set, that is, the right to be forgotten. The naive way of unlearning data is to retrain the model without it from scratch, which becomes extremely time and resource consuming at the modern scale of deep neural networks. Other unlearning approaches by refactoring model or training data struggle to gain a balance between overhead and model usability. In this paper, we propose an approach, dubbed as DeepObliviate, to implement machine unlearning efficiently, without modifying the normal training mode. Our approach improves the original training process by storing intermediate models on the hard disk. Given a data point to unlearn, we first quantify its temporal residual memory left in stored models. The influenced models will be retrained and we decide when to terminate the retraining based on the trend of residual memory on-the-fly. Last, we stitch an unlearned model by combining the retrained models and uninfluenced models. We extensively evaluate our approach on five datasets and deep learning models. Compared to the method of retraining from scratch, our approach can achieve 99.0%, 95.0%, 91.9%, 96.7%, 74.1% accuracy rates and 66.7$\times$, 75.0$\times$, 33.3$\times$, 29.4$\times$, 13.7$\times$ speedups on the MNIST, SVHN, CIFAR-10, Purchase, and ImageNet datasets, respectively. Compared to the state-of-the-art unlearning approach, we improve 5.8% accuracy, 32.5$\times$ prediction speedup, and reach a comparable retrain speedup under identical settings on average on these datasets. Additionally, DeepObliviate can also pass the backdoor-based unlearning verification.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes