LG · ML · Jan 27, 2020

DP-CGAN: Differentially Private Synthetic Data and Label Generation

arXiv:2001.09700v1 · 276 citations
AI Analysis

This work addresses privacy concerns for research communities that rely on sensitive datasets, though it appears incremental because it builds on existing GAN and differential privacy methods.

The paper tackles the problem of preserving individual privacy when training GANs for synthetic data generation. It introduces DP-CGAN, a differentially private conditional GAN framework that improves model performance while maintaining privacy, and reports promising results on MNIST with a single-digit epsilon parameter.

Generative Adversarial Networks (GANs) are among the best-known models for generating synthetic data, including images, especially for research communities that cannot use original sensitive datasets because they are not publicly accessible. One of the main challenges in this area is to preserve the privacy of individuals who participate in the training of the GAN models. To address this challenge, we introduce a Differentially Private Conditional GAN (DP-CGAN) training framework based on a new clipping and perturbation strategy, which improves the performance of the model while preserving the privacy of the training dataset. DP-CGAN generates both synthetic data and corresponding labels and leverages the recently introduced Rényi differential privacy accountant to track the spent privacy budget. The experimental results show that DP-CGAN can generate visually and empirically promising results on the MNIST dataset with a single-digit epsilon parameter in differential privacy.
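
The abstract names a clipping and perturbation strategy but does not spell it out. As a rough illustration of the general idea only, the sketch below shows the generic per-example gradient clipping and Gaussian noise step used in differentially private training. It is not the DP-CGAN strategy itself. The function name `clip_and_perturb` and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for this example, and privacy accounting (for instance with a Rényi accountant) is deliberately left out.

```python
import math
import random


def clip_and_perturb(per_example_grads, clip_norm, noise_multiplier, rng):
    """Generic clip-then-noise step (illustrative, not the paper's exact method).

    per_example_grads: list of gradient vectors, one per training example.
    clip_norm: maximum L2 norm allowed for any single example's gradient.
    noise_multiplier: Gaussian noise std is noise_multiplier * clip_norm.
    rng: a random.Random instance, passed in for reproducibility.
    """
    batch_size = len(per_example_grads)
    dim = len(per_example_grads[0])
    total = [0.0] * dim

    for grad in per_example_grads:
        norm = math.sqrt(sum(g * g for g in grad))
        # Leave small gradients untouched; rescale large ones to clip_norm.
        scale = min(1.0, clip_norm / norm) if norm > 0.0 else 1.0
        for i, g in enumerate(grad):
            total[i] += g * scale

    # Add independent Gaussian noise to each coordinate of the clipped sum.
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]

    # Average over the batch to get the update direction.
    return [v / batch_size for v in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    grads = [[3.0, 4.0], [0.3, 0.4]]  # L2 norms 5.0 and 0.5
    print(clip_and_perturb(grads, clip_norm=1.0, noise_multiplier=1.1, rng=rng))
```

Clipping bounds how much any single training example can change the summed gradient, which is what lets the Gaussian noise scale be tied to `clip_norm`. Setting `noise_multiplier` to 0 disables the noise and is useful only for checking the clipping logic.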

Code Implementations: 1 repo