CR · LG · May 22, 2017

LOGAN: Membership Inference Attacks Against Generative Models

arXiv:1705.07663v4 · 107 citations
Originality: Highly original
AI Analysis

This work exposes a critical privacy vulnerability in generative AI systems that could compromise sensitive training data, particularly for domains like medical imaging.

The authors introduce the first membership inference attacks against generative models, which let an adversary determine whether a specific data point was used in training. The attacks achieve high accuracy (up to 98% on the LFW dataset) against state-of-the-art generative models across several domains, including faces, objects, and medical images.

Generative models estimate the underlying distribution of a dataset to generate realistic samples according to that distribution. In this paper, we present the first membership inference attacks against generative models: given a data point, the adversary determines whether or not it was used to train the model. Our attacks leverage Generative Adversarial Networks (GANs), which combine a discriminative and a generative model, to detect overfitting and recognize inputs that were part of training datasets, using the discriminator's capacity to learn statistical differences in distributions. We present attacks based on both white-box and black-box access to the target model, against several state-of-the-art generative models, over datasets of complex representations of faces (LFW), objects (CIFAR-10), and medical images (Diabetic Retinopathy). We also discuss the sensitivity of the attacks to different training parameters, and their robustness against mitigation strategies, finding that defenses are either ineffective or lead to significantly worse performances of the generative models in terms of training stability and/or sample quality.
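The core idea in the abstract, that an overfitted discriminator tends to score training samples higher than unseen ones, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`infer_members`, `attack_accuracy`), the toy scores, and the assumption that the attacker knows how many candidates were training members are all hypothetical, chosen only for illustration.

```python
from typing import Callable, Sequence, Set


def infer_members(
    candidates: Sequence[str],
    score_fn: Callable[[str], float],
    n_members: int,
) -> Set[int]:
    """Return the indices of the n_members candidates with the highest scores.

    score_fn stands in for a discriminator's confidence that a sample looks
    like training data (hypothetical; any real model would be plugged in here).
    """
    scores = [score_fn(x) for x in candidates]
    # Rank candidate indices from highest to lowest score.
    ranked = sorted(range(len(candidates)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:n_members])


def attack_accuracy(predicted: Set[int], true_members: Set[int]) -> float:
    """Fraction of the true training members that the attack recovered."""
    if not true_members:
        return 0.0
    return len(predicted & true_members) / len(true_members)


# Toy data (made up): a discriminator that has overfitted gives higher
# scores to the samples it saw during training ("a" and "c").
toy_scores = {"a": 0.91, "b": 0.35, "c": 0.88, "d": 0.40}
candidates = list(toy_scores)

predicted = infer_members(candidates, lambda x: toy_scores[x], n_members=2)
accuracy = attack_accuracy(predicted, true_members={0, 2})
```

In a white-box setting the scores would come from the target model's own discriminator; in a black-box setting the attacker would first have to train a substitute discriminator, which this sketch does not cover.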

Code Implementations: 1 repo
