LG AI CR · Oct 21, 2020

Amnesiac Machine Learning

Laura Graves, Vineel Nagisetty, Vijay Ganesh

arXiv:2010.10981v135.7430 citations

Originality: Incremental advance

AI Analysis

This work addresses the regulatory compliance and privacy protection needs of model owners handling EU resident data, offering a solution to a specific legal and security challenge.

The paper tackles the problem of ensuring machine learning models comply with GDPR's Right to be Forgotten by removing personal data without vulnerability to model inversion and membership inference attacks, presenting two efficient methods that effectively remove sensitive information while maintaining model efficacy.

The Right to be Forgotten is part of the recently enacted General Data Protection Regulation (GDPR), which affects any data holder that has data on European Union residents. It gives EU residents the ability to request deletion of their personal data, including training records used to train machine learning models. Unfortunately, Deep Neural Network models are vulnerable to information leaking attacks such as model inversion attacks, which extract class information from a trained model, and membership inference attacks, which determine the presence of an example in a model's training data. If a malicious party can mount an attack and learn private information that was meant to be removed, this implies that the model owner has not properly protected their users' rights and that their models may not be compliant with the GDPR. In this paper, we present two efficient methods that address the question of how a model owner or data holder may delete personal data from models so that the models are not vulnerable to model inversion and membership inference attacks, while maintaining model efficacy. We start by presenting a real-world threat model that shows that simply removing training data is insufficient to protect users. We follow that up with two data removal methods, namely Unlearning and Amnesiac Unlearning, that enable model owners to protect themselves against such attacks while being compliant with regulations. We provide extensive empirical analysis showing that these methods are efficient, safe to apply, and effective at removing learned information about sensitive data from trained models while maintaining model efficacy.
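The abstract names Amnesiac Unlearning without describing its mechanics. The sketch below illustrates one reading of the idea, under the assumption that the method logs the parameter updates made by training batches that contain sensitive records and later subtracts those updates from the trained parameters. The toy linear model and the names `sgd_with_update_log` and `amnesiac_rollback` are hypothetical and chosen for illustration only; they are not taken from the paper, which targets deep neural networks.

```python
import random


def sgd_with_update_log(data, sensitive_ids, epochs=5, batch_size=4, lr=0.05, seed=0):
    """Fit y ~ w*x + b by mini-batch SGD on (x, y) pairs.

    Assumption for this sketch: every update applied by a batch that
    contains at least one sensitive row is recorded for later removal.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    logged = []  # (dw, db) pairs from batches touching sensitive rows
    ids = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(ids)
        for start in range(0, len(ids), batch_size):
            batch = ids[start:start + batch_size]
            # Gradient of mean squared error over the batch.
            errors = [(w * data[i][0] + b - data[i][1], data[i][0]) for i in batch]
            gw = sum(2 * e * x for e, x in errors) / len(batch)
            gb = sum(2 * e for e, _ in errors) / len(batch)
            dw, db = -lr * gw, -lr * gb
            w, b = w + dw, b + db
            if any(i in sensitive_ids for i in batch):
                logged.append((dw, db))
    return (w, b), logged


def amnesiac_rollback(params, logged):
    """Subtract every logged update from the final parameters."""
    w, b = params
    return (w - sum(dw for dw, _ in logged), b - sum(db for _, db in logged))


# Toy data on the line y = 2x + 1; row 3 is treated as the sensitive record.
data = [(i / 10, 2 * (i / 10) + 1) for i in range(12)]
params, logged = sgd_with_update_log(data, sensitive_ids={3})
rolled_back = amnesiac_rollback(params, logged)
```

In this sketch the rollback is only approximate as a form of removal: updates applied after a logged batch were computed from parameters that already reflected it, so subtracting the logged updates does not reproduce a model trained without the sensitive row. A model owner would presumably need to check model efficacy afterwards and fine-tune if it has degraded.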
