LG · Dec 3, 2021

Generative Adversarial Networks for Synthetic Data Generation: A Comparative Study

arXiv:2112.01925v1 · 31 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses data confidentiality issues for census data users, but it is incremental as it applies existing GAN methods to a new domain.

The study compared Generative Adversarial Networks (GANs) with traditional methods for generating synthetic census microdata, using utility and disclosure risk metrics to evaluate their performance.

Generative Adversarial Networks (GANs) are gaining increasing attention as a means of synthesising data. So far, much of this work has been applied to use cases outside the data confidentiality domain, a common application being the production of artificial images. Here we consider the potential application of GANs for generating synthetic census microdata. We employ a battery of utility metrics and a disclosure risk metric (the Targeted Correct Attribution Probability) to compare the data produced by tabular GANs with those produced using orthodox data synthesis methods.

