LG · CR · CY · ML · Jul 23, 2020

Private Post-GAN Boosting

arXiv:2007.11934v2 · 30 citations
AI Analysis

This paper addresses the challenge of generating realistic synthetic data while preserving privacy, offering an incremental improvement for applications in data sharing and machine learning with sensitive information.

The paper tackles the poor utility of differentially private GANs, which is caused by the noise added during training. It proposes Private post-GAN boosting (Private PGB), which reweights and combines samples from the sequence of generators obtained during training to create a high-quality synthetic dataset. The method improves upon a standard private GAN across a collection of quality measures on two-dimensional toy data, MNIST, US Census data, and a prediction task.

Differentially private GANs have proven to be a promising approach for generating realistic synthetic data without compromising the privacy of individuals. Due to the privacy-protective noise introduced in the training, the convergence of GANs becomes even more elusive, which often leads to poor utility in the output generator at the end of training. We propose Private post-GAN boosting (Private PGB), a differentially private method that combines samples produced by the sequence of generators obtained during GAN training to create a high-quality synthetic dataset. To that end, our method leverages the Private Multiplicative Weights method (Hardt and Rothblum, 2010) to reweight generated samples. We evaluate Private PGB on two-dimensional toy data, MNIST images, US Census data, and a standard machine learning prediction task. Our experiments show that Private PGB improves upon a standard private GAN approach across a collection of quality measures. We also provide a non-private variant of PGB that improves the data quality of standard GAN training.
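
The reweighting idea in the abstract can be illustrated with a small, non-private multiplicative-weights loop. This is only a sketch under stated assumptions, not the paper's algorithm: the function name `reweight_samples`, the score matrix layout, and the parameters `rounds` and `eta` are hypothetical, and the differential-privacy machinery (noise and private selection) is omitted entirely. It assumes each saved discriminator has already scored every generated sample in [0, 1], with higher meaning "looks more real".

```python
import math


def reweight_samples(scores, rounds=50, eta=0.5):
    """Non-private multiplicative-weights reweighting (illustrative only).

    scores[j][i] is the score in [0, 1] that discriminator j assigns to
    generated sample i (higher = looks more real). Returns a list of
    weights over the samples that sums to 1.
    """
    n = len(scores[0])
    log_w = [0.0] * n   # log-weights, kept in log space for stability
    avg = [0.0] * n     # running average of the per-round distributions

    for _ in range(rounds):
        # Normalize the current weights into a distribution over samples.
        m = max(log_w)
        w = [math.exp(v - m) for v in log_w]
        total = sum(w)
        phi = [v / total for v in w]
        avg = [a + p / rounds for a, p in zip(avg, phi)]

        # Pick the discriminator that rates the current mixture as least
        # realistic (a plain argmin here; no privacy mechanism is applied).
        j = min(
            range(len(scores)),
            key=lambda k: sum(p * s for p, s in zip(phi, scores[k])),
        )

        # Multiplicative update: upweight samples that discriminator j
        # considers realistic.
        log_w = [lw + eta * s for lw, s in zip(log_w, scores[j])]

    return avg


if __name__ == "__main__":
    # Two hypothetical discriminators scoring three generated samples.
    demo_scores = [[0.9, 0.1, 0.5], [0.8, 0.2, 0.5]]
    print(reweight_samples(demo_scores))
```

The returned weights could then be used to resample the pooled generator outputs into a synthetic dataset. Averaging the per-round distributions, rather than returning the last one, is a common choice for no-regret dynamics and is an assumption of this sketch.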

Code Implementations: 1 repo