Generating Artificial Data for Private Deep Learning
The work addresses privacy concerns for users of deep learning systems, though it appears incremental because it builds on existing generative and privacy methods.
The paper tackles privacy in deep learning by generating artificial data that retains the statistical properties of the real data. It uses a generative adversarial network to produce the samples and an empirical method to assess the risk of information disclosure. Experiments show high-quality data generation and successful model training while limiting privacy loss.
In this paper, we propose generating artificial data that retain statistical properties of real data as a means of providing privacy with respect to the original dataset. We use a generative adversarial network to draw privacy-preserving artificial data samples and derive an empirical method to assess the risk of information disclosure in a differential-privacy-like way. Our experiments show that we are able to generate artificial data of high quality and successfully train and validate machine learning models on this data while limiting potential privacy loss.
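
The idea of training on artificial data that retains the statistical properties of the real data can be illustrated with a small sketch. Everything here is an assumption made for illustration: a per-class Gaussian fit stands in for the generative adversarial network, the one-dimensional toy dataset and all function names are hypothetical, and the check against held-out real data is one plausible evaluation protocol, not necessarily the one used in the paper. The sketch does not implement the paper's privacy assessment.

```python
import random
import statistics

def fit_stand_in_generator(rows):
    """Assumed stand-in for a GAN: per-class mean and stdev of a 1-D feature."""
    params = {}
    for label in {y for _, y in rows}:
        xs = [x for x, y in rows if y == label]
        params[label] = (statistics.mean(xs), statistics.stdev(xs))
    return params

def sample_artificial(params, n_per_class, rng):
    """Draw artificial (feature, label) pairs from the fitted stand-in generator."""
    rows = []
    for label, (mu, sigma) in sorted(params.items()):
        rows += [(rng.gauss(mu, sigma), label) for _ in range(n_per_class)]
    return rows

def train_nearest_mean(rows):
    """Toy model: remember the mean feature value of each class."""
    return {
        label: statistics.mean(x for x, y in rows if y == label)
        for label in {y for _, y in rows}
    }

def accuracy(class_means, rows):
    """Fraction of rows whose nearest class mean matches the true label."""
    hits = sum(
        min(class_means, key=lambda c: abs(x - class_means[c])) == y
        for x, y in rows
    )
    return hits / len(rows)

def make_real(n_per_class, rng):
    """Hypothetical 'real' data: two well-separated 1-D classes."""
    return ([(rng.gauss(0.0, 1.0), 0) for _ in range(n_per_class)]
            + [(rng.gauss(4.0, 1.0), 1) for _ in range(n_per_class)])

rng = random.Random(0)
real_train = make_real(200, rng)
real_test = make_real(200, rng)

# Only the artificial samples are handed to the downstream model.
artificial = sample_artificial(fit_stand_in_generator(real_train), 200, rng)
acc_artificial = accuracy(train_nearest_mean(artificial), real_test)
acc_real = accuracy(train_nearest_mean(real_train), real_test)
print(f"trained on artificial: {acc_artificial:.3f}, trained on real: {acc_real:.3f}")
```

If the artificial data preserves the relevant statistics, the two accuracies should be close. A small gap says nothing about privacy on its own, which is why the paper pairs generation with a separate empirical disclosure-risk assessment.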