LG Oct 5, 2022

ciDATGAN: Conditional Inputs for Tabular GANs

Gael Lederrey, Tim Hillel, Michel Bierlaire

arXiv:2210.02404v11.82 citations h-index: 62

Originality: Incremental advance

AI Analysis

This work addresses the challenge of generating high-quality conditional synthetic tabular data for applications such as data augmentation and bias correction, though it builds incrementally on the existing DATGAN method.

The authors tackle the problem of incorporating conditionality into tabular GANs. Existing approaches rely on latent conditionality, which either restricts the generated data or does not produce sufficiently good results. They propose ciDATGAN, an evolution of DATGAN, a model that has already been shown to outperform state-of-the-art tabular GANs. ciDATGAN can unbias datasets and complete large synthetic datasets using data from a smaller feeder dataset.

Conditionality has become a core component of Generative Adversarial Networks (GANs) for generating synthetic images. GANs usually use latent conditionality to control the generation process. However, tabular data only contains manifest variables. Thus, latent conditionality either restricts the generated data or does not produce sufficiently good results. Therefore, we propose a new methodology to include conditionality in tabular GANs, inspired by image completion methods. This article presents ciDATGAN, an evolution of the Directed Acyclic Tabular GAN (DATGAN), which has already been shown to outperform state-of-the-art tabular GAN models. First, we show that the addition of conditional inputs does not hinder the model's performance compared to its predecessor. Then, we demonstrate that ciDATGAN can be used to unbias datasets with the help of well-chosen conditional inputs. Finally, we show that ciDATGAN can learn the logic behind the data and, thus, be used to complete large synthetic datasets using data from a smaller feeder dataset.
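
As a rough illustration of the conditional-input idea described in the abstract, the sketch below takes conditional columns from a small feeder dataset and generates the remaining column given those inputs. It is not the ciDATGAN model: the trained generator is replaced by a simple empirical lookup table, and every name in it (`fit_conditional_sampler`, `complete_dataset`, the `area` and `mode` columns, and the toy data) is a hypothetical choice made for this example.

```python
import random
from collections import defaultdict


def fit_conditional_sampler(rows, cond_cols, target_col):
    """Group observed target values by the values of the conditional columns.

    This lookup table is only a stand-in for a trained conditional generator.
    """
    table = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in cond_cols)
        table[key].append(row[target_col])
    return dict(table)


def complete_dataset(feeder_rows, table, cond_cols, target_col, n, rng):
    """Build n synthetic rows: conditional inputs are drawn from the feeder
    data, and the remaining column is sampled given those inputs."""
    pooled = [v for values in table.values() for v in values]
    synthetic = []
    for _ in range(n):
        feeder = rng.choice(feeder_rows)
        key = tuple(feeder[c] for c in cond_cols)
        new_row = {c: feeder[c] for c in cond_cols}
        # Fall back to the pooled values if this combination was never observed.
        new_row[target_col] = rng.choice(table.get(key, pooled))
        synthetic.append(new_row)
    return synthetic


# Hypothetical biased training data: 90% urban rows, 10% rural rows.
train_rows = (
    [{"area": "urban", "mode": "metro"}] * 60
    + [{"area": "urban", "mode": "bus"}] * 30
    + [{"area": "rural", "mode": "car"}] * 10
)
# Hypothetical small feeder dataset with a balanced area split.
feeder_rows = [{"area": "urban"}] * 5 + [{"area": "rural"}] * 5

rng = random.Random(0)
table = fit_conditional_sampler(train_rows, ["area"], "mode")
synthetic = complete_dataset(feeder_rows, table, ["area"], "mode", 1000, rng)
```

Because the `area` values are drawn from the balanced feeder data, the synthetic rows follow the feeder's area split rather than the 90/10 split of the training data, while the relationship between `area` and `mode` is preserved. This is the intuition behind using well-chosen conditional inputs to unbias a dataset, under the simplifying assumptions above.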
