LG | ML | Nov 27, 2018

Synthesizing Tabular Data using Generative Adversarial Networks

arXiv:1811.11264v1 | 319 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for realistic synthetic data generation in domains like healthcare and education, but it is an incremental advance because it adapts existing GAN methods to tabular data.

The paper tackles the problem of generating synthetic tabular data, such as medical or educational records, by introducing Tabular GAN (TGAN), a generative adversarial network that produces high-quality synthetic tables containing both discrete and continuous variables. Evaluated on three datasets, TGAN outperforms conventional statistical generative models both in capturing correlations between columns and in scaling to large datasets.

Generative adversarial networks (GANs) implicitly learn the probability distribution of a dataset and can draw samples from the distribution. This paper presents Tabular GAN (TGAN), a generative adversarial network that can generate tabular data such as medical or educational records. Using the power of deep neural networks, TGAN generates high-quality and fully synthetic tables while simultaneously generating discrete and continuous variables. When we evaluate our model on three datasets, we find that TGAN outperforms conventional statistical generative models in both capturing the correlation between columns and scaling up for large datasets.
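The abstract's claim about "capturing the correlation between columns" can be made concrete with a small check. The sketch below is not the paper's evaluation protocol and does not use TGAN itself: `correlation_gap` is a hypothetical helper, the data is simulated, and Pearson correlation on numeric columns is an assumed stand-in for whatever measure the authors used. It compares the column-correlation matrix of a real table with that of a synthetic one.

```python
import numpy as np


def correlation_gap(real: np.ndarray, synthetic: np.ndarray) -> float:
    """Mean absolute difference between off-diagonal Pearson correlations.

    Hypothetical helper for illustration; rows are records, columns are
    numeric variables. Smaller values mean the synthetic table preserves
    the pairwise column correlations of the real table more closely.
    """
    real_corr = np.corrcoef(real, rowvar=False)
    synth_corr = np.corrcoef(synthetic, rowvar=False)
    # Ignore the diagonal, which is always 1 for both matrices.
    off_diagonal = ~np.eye(real_corr.shape[0], dtype=bool)
    return float(np.abs(real_corr - synth_corr)[off_diagonal].mean())


rng = np.random.default_rng(0)
cov = [[1.0, 0.8], [0.8, 1.0]]

# Simulated "real" table with two strongly correlated columns.
real = rng.multivariate_normal([0.0, 0.0], cov, size=5000)

# A synthetic table drawn from the same joint distribution.
good = rng.multivariate_normal([0.0, 0.0], cov, size=5000)

# A synthetic table that keeps each column's marginal distribution but
# shuffles the columns independently, destroying their correlation.
bad = np.column_stack(
    [rng.permutation(real[:, 0]), rng.permutation(real[:, 1])]
)

gap_good = correlation_gap(real, good)
gap_bad = correlation_gap(real, bad)
print(f"joint sampler gap:       {gap_good:.3f}")
print(f"independent columns gap: {gap_bad:.3f}")
```

The second synthetic table matches every column's marginal distribution yet scores far worse, which is the failure mode a per-column generator would exhibit and the reason correlation between columns is a useful criterion.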

Code Implementations: 7 repos

Data from Papers with Code (CC-BY-SA-4.0)
