LG CR ML · Jun 29, 2022

Approximate Data Deletion in Generative Models

arXiv:2206.14439v15.88 citations · h-index: 9

Originality: Incremental advance

AI Analysis

This work addresses the need to comply with data privacy regulations such as GDPR and CCPA by offering a computationally efficient way to remove a user's data from a trained generative model. The advance is incremental because it builds on existing approximate deletion methods for supervised learning.

The paper tackles efficient data deletion in generative models, which remains an open problem in unsupervised learning. It proposes a density-ratio-based framework that enables fast approximate deletion and a statistical test for verifying deletion, with empirical demonstrations across several generative methods.

Users have the right to have their data deleted by third-party learned systems, as codified by recent legislation such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Such data deletion can be accomplished by full re-training, but this incurs a high computational cost for modern machine learning models. To avoid this cost, many approximate data deletion methods have been developed for supervised learning. Unsupervised learning, in contrast, remains largely an open problem when it comes to (approximate or exact) efficient data deletion. In this paper, we propose a density-ratio-based framework for generative models. Using this framework, we introduce a fast method for approximate data deletion and a statistical test for estimating whether or not training points have been deleted. We provide theoretical guarantees under various learner assumptions and empirically demonstrate our methods across a variety of generative methods.
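
To make the density-ratio idea concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy setting that the abstract does not specify: the generative model is a 1-D Gaussian kernel density estimate, so the ratio between the density retrained without the deleted points and the original density can be computed exactly, and samples from the original model are filtered by rejection sampling. All function names, the bandwidth, and the data are hypothetical.

```python
import numpy as np


def kde_density(x, data, bandwidth):
    """Gaussian kernel density estimate at points x, fit on 1-D data."""
    diffs = (x[:, None] - data[None, :]) / bandwidth
    kernels = np.exp(-0.5 * diffs**2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernels.mean(axis=1)


def sample_kde(data, bandwidth, n, rng):
    """Draw n samples from the KDE: pick a training point, add kernel noise."""
    centers = rng.choice(data, size=n)
    return centers + bandwidth * rng.standard_normal(n)


def sample_after_deletion(data, keep_mask, bandwidth, n, rng):
    """Sample from the original KDE, then reject using the density ratio
    p_kept(x) / p_full(x), so accepted samples follow the KDE of kept points.

    Toy assumption: both densities are available in closed form.
    """
    kept = data[keep_mask]
    # For a KDE, p_full >= (N_kept / N) * p_kept, so the ratio is at most N / N_kept.
    bound = len(data) / len(kept)
    accepted = np.empty(0)
    while accepted.size < n:
        proposals = sample_kde(data, bandwidth, 2 * n, rng)
        ratio = kde_density(proposals, kept, bandwidth) / kde_density(
            proposals, data, bandwidth
        )
        accept = rng.random(proposals.size) < ratio / bound
        accepted = np.concatenate([accepted, proposals[accept]])
    return accepted[:n]


rng = np.random.default_rng(0)
# Hypothetical training set: a large cluster near -2 and a small one near +3.
data = np.concatenate([rng.normal(-2.0, 0.5, 500), rng.normal(3.0, 0.5, 50)])
keep_mask = data < 1.0  # "delete" the small cluster near +3

original = sample_kde(data, 0.3, 2000, rng)
after = sample_after_deletion(data, keep_mask, 0.3, 2000, rng)
print("fraction near deleted cluster before:", np.mean(original > 1.5))
print("fraction near deleted cluster after: ", np.mean(after > 1.5))
```

In this toy case no retraining is needed to sample as if the points had been removed, because the original sampler is reused and only the accept step changes. For models without a tractable density, the ratio would have to be estimated, which this sketch does not attempt.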
