LG AI

Jun 30, 2023

FFPDG: Fast, Fair and Private Data Generation

Weijie Xu, Jinjin Zhao, Francis Iannacci, Bo Wang

Amazon

arXiv:2307.00161v19.813 citations

h-index: 9

Originality: Incremental advance

AI Analysis

This paper addresses fairness and privacy concerns in synthetic data generation for machine learning applications, though it appears incremental because it builds on existing GAN-based methods.

The authors tackled the problem of generating synthetic data that balances fairness, privacy, and computational efficiency, showing that models trained on their generated data perform well in real-world inference scenarios with theoretical and empirical validation.

Generative modeling is frequently used for synthetic data generation, where fairness and privacy are two major concerns. Although recent GAN-based [\cite{goodfellow2014generative}] methods show good results in preserving privacy, the generated data may be more biased. These methods also require substantial computational resources. In this work, we design a fast, fair, flexible, and private data generation method. We demonstrate its effectiveness both theoretically and empirically, and show that models trained on data generated by the proposed method can perform well at inference time in real application scenarios.
