LG, Dec 3, 2021

Generative Adversarial Networks for Synthetic Data Generation: A Comparative Study

Claire Little, Mark Elliot, Richard Allmendinger, Sahel Shariati Samani

arXiv:2112.01925v18.431 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses data confidentiality issues for census data users, but it is incremental, since it applies existing GAN methods to a new domain.

The study compared Generative Adversarial Networks (GANs) with traditional methods for generating synthetic census microdata, using utility and disclosure risk metrics to evaluate their performance.

Generative Adversarial Networks (GANs) are gaining increasing attention as a means for synthesising data. So far, much of this work has been applied to use cases outside of the data confidentiality domain, with a common application being the production of artificial images. Here we consider the potential application of GANs for the purpose of generating synthetic census microdata. We employ a battery of utility metrics and a disclosure risk metric (the Targeted Correct Attribution Probability) to compare the data produced by tabular GANs with those produced using orthodox data synthesis methods.
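
The abstract names the Targeted Correct Attribution Probability (TCAP) but does not define it. As a rough illustration only, the sketch below implements one common simplified formulation: an intruder who knows a set of key variables looks for synthetic records whose key combination always implies the same target value, then checks how often that inference is correct in the original data. The function name `tcap`, the parameter `tau`, the choice to score unmatched keys as 0, and the toy records are all assumptions made for this example; the paper's exact formulation may differ.

```python
from collections import Counter


def tcap(original, synthetic, keys, target, tau=1.0):
    """Simplified TCAP sketch (assumed formulation, not the paper's code).

    original, synthetic: lists of dicts (one dict per record).
    keys: variables the hypothetical intruder already knows.
    target: the sensitive variable the intruder tries to infer.
    tau: minimum within-class agreement in the synthetic data.
    """
    def key_of(rec):
        return tuple(rec[k] for k in keys)

    # Counts of key combinations and (key, target) pairs in each data set.
    syn_keys = Counter(key_of(r) for r in synthetic)
    syn_pairs = Counter((key_of(r), r[target]) for r in synthetic)
    orig_keys = Counter(key_of(r) for r in original)
    orig_pairs = Counter((key_of(r), r[target]) for r in original)

    scores = []
    for rec in synthetic:
        k, t = key_of(rec), rec[target]
        # Share of synthetic records with this key that agree on the target.
        agreement = syn_pairs[(k, t)] / syn_keys[k]
        if agreement < tau:
            continue  # the intruder would not trust this inference
        if orig_keys[k] == 0:
            scores.append(0.0)  # assumption: no match counts as a failed attack
            continue
        # Share of original records with this key that have the inferred target.
        scores.append(orig_pairs[(k, t)] / orig_keys[k])

    return sum(scores) / len(scores) if scores else 0.0


# Toy records, invented purely for illustration.
original = [
    {"age": "30-39", "sex": "F", "job": "nurse"},
    {"age": "30-39", "sex": "F", "job": "nurse"},
    {"age": "40-49", "sex": "M", "job": "teacher"},
    {"age": "40-49", "sex": "M", "job": "driver"},
]
synthetic = [
    {"age": "30-39", "sex": "F", "job": "nurse"},
    {"age": "40-49", "sex": "M", "job": "teacher"},
    {"age": "40-49", "sex": "M", "job": "teacher"},
    {"age": "50-59", "sex": "F", "job": "clerk"},
]

score = tcap(original, synthetic, keys=["age", "sex"], target="job")
print(score)
```

Under this formulation a score near 1 means the synthetic data lets an intruder infer the target almost as reliably as the original would, while a score near the baseline rate of the target variable suggests little added disclosure risk.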
