Generative Models with Information-Theoretic Protection Against Membership Inference Attacks
This work addresses privacy risks in generative models used for data synthesis, offering a practical solution with incremental improvements over existing methods.
The paper tackles the problem of generative models leaking private training data via membership inference attacks by proposing an information-theoretic regularization term that prevents overfitting and encourages generalizability. The results show that with this low-cost regularization, GANs preserve privacy, generate high-quality samples, and achieve better downstream classification performance than non-private and differentially private models.
Deep generative models, such as Generative Adversarial Networks (GANs), synthesize diverse high-fidelity data samples by estimating the underlying distribution of high-dimensional data. Despite their success, GANs may disclose private information from the data they are trained on, making them susceptible to adversarial attacks such as membership inference attacks, in which an adversary aims to determine whether a record was part of the training set. We propose an information-theoretically motivated regularization term that prevents the generative model from overfitting to training data and encourages generalizability. We show that this penalty minimizes the Jensen-Shannon divergence between components of the generator trained on data with different membership, and that it can be implemented at low cost using an additional classifier. Our experiments on image datasets demonstrate that with the proposed regularization, which comes at only a small added computational cost, GANs are able to preserve privacy and generate high-quality samples that achieve better downstream classification performance than non-private and differentially private generative models.
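To make the quantity named in the abstract concrete, the following is a minimal sketch of the Jensen-Shannon divergence between two discrete distributions. It is an illustration only and not the paper's implementation: the function names `kl_divergence` and `js_divergence` are hypothetical, the distributions are assumed to be plain lists of probabilities over the same finite support, and the result is in nats. The paper's regularizer operates on generator components and is estimated with a classifier, which this sketch does not attempt to reproduce.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions of equal length.

    Terms with p_i == 0 contribute 0 by convention. Assumes q_i > 0
    wherever p_i > 0 (always true when q is the mixture used below).
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: the average KL divergence of p and q
    from their equal-weight mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


if __name__ == "__main__":
    # Hypothetical toy distributions, chosen only for illustration.
    same = js_divergence([0.5, 0.5], [0.5, 0.5])
    disjoint = js_divergence([1.0, 0.0], [0.0, 1.0])
    print(same, disjoint)
```

The divergence is zero when the two distributions coincide and reaches its maximum, log 2 in nats, when their supports are disjoint. Driving it toward zero therefore makes the two distributions harder to tell apart, which is the intuition behind penalizing differences that depend on training-set membership.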