Differentially Private Generative Adversarial Networks for Time Series, Continuous, and Discrete Open Data
This work addresses the need for privacy-preserving data publishing across domains, offering a flexible way to generate synthetic data while protecting individual users, though it builds incrementally on existing deep learning methods.
The paper tackles the problem of generating high-quality open datasets with privacy guarantees by introducing a differentially private generative adversarial network framework that works for time series, continuous, and discrete data, demonstrating its efficiency on real and benchmark datasets.
Open data plays a fundamental role in the 21st century by stimulating economic growth and by enabling more transparent and inclusive societies. However, it is often difficult to create new high-quality datasets with the privacy guarantees that many use cases require. This paper aims to create a framework for releasing new open data while protecting the individuality of the users through a strict definition of privacy called differential privacy. Unlike previous work, this paper provides a framework for privacy-preserving data publishing that can be easily adapted to different use cases, from the generation of time series to continuous data and discrete data; no previous work has focused on the latter class. Indeed, many use cases expose discrete data, or at least a combination of categorical and numerical values. Thanks to the latest developments in deep learning and generative models, it is now possible to model semantically rich data while maintaining both the original distribution of the features and the correlations between them. The output of this framework is a deep network, namely a generator, able to create new data on demand. We demonstrate the efficiency of our approach on real datasets from the French public administration and on classic benchmark datasets.
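The abstract does not state how differential privacy is enforced during training. As an illustration only, the sketch below shows one common way to privatize a gradient step in adversarial training: clip each per-example gradient to a fixed L2 norm, sum the clipped gradients, add Gaussian noise scaled to that norm, and average. The function names and the parameters `max_norm` and `noise_multiplier` are hypothetical and are not taken from the paper; the privacy accounting that turns the noise level into a formal guarantee is omitted.

```python
import math
import random


def clip_by_l2_norm(grad, max_norm):
    """Scale grad down so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, max_norm, noise_multiplier, rng):
    """Clip each per-example gradient, sum, add Gaussian noise, then average.

    Assumption: the noise standard deviation is noise_multiplier * max_norm,
    as in the usual clipped-and-noised gradient recipe; this is an
    illustrative choice, not the paper's stated mechanism.
    """
    clipped = [clip_by_l2_norm(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]


# Toy per-example gradients for a two-parameter model (made-up values).
rng = random.Random(0)
grads = [[3.0, 4.0], [0.3, 0.4], [-6.0, 8.0]]
update = private_mean_gradient(grads, max_norm=1.0, noise_multiplier=1.1, rng=rng)
```

Clipping bounds how much any single record can move the update, and the added noise masks what remains of that influence. In a setup of this kind, only the network that sees real data would need the noisy update, and a generator trained solely through that network could then be released.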