Generating Multi-Categorical Samples with Generative Adversarial Networks
This work addresses a specific challenge in generating discrete data, with applications such as synthetic data creation, but its contribution is incremental because it builds on existing GAN methods.
The paper tackles the problem of generating multi-categorical data using Generative Adversarial Networks (GANs), which struggle with discrete data, and proposes architectures based on Gumbel softmax layers that outperform existing models.
We propose a method to train generative adversarial networks on multivariate feature vectors representing multiple categorical values. In contrast to the continuous domain, where GAN-based methods have delivered considerable results, GANs struggle to perform equally well on discrete data. We propose and compare several architectures based on multiple (Gumbel) softmax output layers taking into account the structure of the data. We evaluate the performance of our architecture on datasets with different sparsity, number of features, ranges of categorical values, and dependencies among the features. Our proposed architecture and method outperform existing models.
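To illustrate the idea of multiple (Gumbel) softmax output layers, the following is a minimal sketch, not the authors' implementation. It assumes the generator emits one flat vector of logits and that each categorical variable owns a contiguous slice of it. The function names, the `variable_sizes` parameter, and the default temperature are hypothetical choices made for this example, and plain Python is used in place of a deep learning framework.

```python
import math
import random


def gumbel_softmax(logits, temperature, rng):
    """Return a relaxed one-hot sample for one categorical variable."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise is -log(-log(u)) with u ~ Uniform(0, 1).
        # Clamp u away from 0 and 1 to keep both logarithms finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    # Numerically stable softmax over the perturbed logits.
    peak = max(noisy)
    exps = [math.exp(value - peak) for value in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def multi_categorical_output(logits, variable_sizes, temperature=0.5, rng=None):
    """Apply a separate Gumbel-softmax to each variable's slice of logits.

    variable_sizes (hypothetical name) lists the number of categories of
    each variable, e.g. [3, 4] for two variables with 3 and 4 categories.
    """
    rng = rng or random.Random()
    if sum(variable_sizes) != len(logits):
        raise ValueError("variable_sizes must sum to the number of logits")
    output = []
    start = 0
    for size in variable_sizes:
        output.extend(gumbel_softmax(logits[start:start + size], temperature, rng))
        start += size
    return output
```

Each slice of the result is non-negative and sums to one, so the concatenated vector has the same layout as a concatenation of one-hot encodings. Lower temperatures push each slice closer to a one-hot vector, at the cost of noisier gradients in a differentiable implementation.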