LG | Sep 8, 2021

EMA: Auditing Data Removal from Trained Models

arXiv:2109.03675v2 | 16 citations | Has Code
Originality: Incremental advance
AI Analysis

This work addresses data privacy and compliance concerns for machine learning practitioners, though the advance is incremental because it builds on existing auditing methods.

The paper tackles the problem of verifying data removal from trained models by proposing Ensembled Membership Auditing (EMA), which overcomes limitations of a prior KS distance-based method and shows robustness on benchmark datasets like MNIST, SVHN, and Chest X-ray with MLPs and CNNs.

Data auditing is a process to verify whether certain data have been removed from a trained model. A recently proposed method (Liu et al. 20) uses Kolmogorov-Smirnov (KS) distance for such data auditing. However, it fails under certain practical conditions. In this paper, we propose a new method called Ensembled Membership Auditing (EMA) for auditing data removal to overcome these limitations. We compare both methods using benchmark datasets (MNIST and SVHN) and Chest X-ray datasets with multi-layer perceptrons (MLP) and convolutional neural networks (CNN). Our experiments show that EMA is robust under various conditions, including the failure cases of the previously proposed method. Our code is available at: https://github.com/Hazelsuko07/EMA.
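For readers unfamiliar with the KS distance mentioned in the abstract, the following is a minimal sketch of the generic two-sample Kolmogorov-Smirnov statistic: the largest gap between two empirical cumulative distribution functions. It is not the auditing procedure of Liu et al. and not EMA; the function name `ks_distance` and the sample scores in the usage lines are illustrative assumptions, not taken from the paper or its repository.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Two-sample KS distance: the maximum gap between the empirical CDFs."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("both samples must be non-empty")

    max_gap = 0.0
    # Both empirical CDFs are step functions that only change at observed
    # values, so it is enough to compare them at those points.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical per-sample scores (for example, model confidences) from two groups.
scores_query = [0.91, 0.88, 0.95, 0.79]
scores_reference = [0.52, 0.61, 0.93, 0.47]
print(ks_distance(scores_query, scores_reference))
```

A value near 0 means the two score distributions are hard to tell apart, and a value near 1 means they barely overlap. How such a statistic is turned into an auditing decision is specific to each method and is not shown here.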

Code Implementations: 1 repo
