LG, ML · Dec 4, 2018

Learning Vine Copula Models For Synthetic Data Generation

Yi Sun, Alfredo Cuesta-Infante, Kalyan Veeramachaneni

arXiv:1812.01226v1 · 5.751 citations

Originality: Incremental advance

AI Analysis

This work addresses the model selection problem for vine copulas in synthetic data generation, offering an incremental improvement through neural network-based learning.

The paper tackled the challenge of selecting a vine copula model from exponentially many configurations by formulating structure learning with vector and reinforcement learning representations, using neural networks to find embeddings and generate structures. The approach achieved better data fit in terms of log-likelihood on synthetic and real-world datasets and generated high-quality synthetic samples.

A vine copula model is a flexible high-dimensional dependence model that uses only bivariate building blocks. However, the number of possible configurations of a vine copula grows exponentially as the number of variables increases, making model selection a major challenge in development. In this work, we formulate a vine structure learning problem with both vector and reinforcement learning representations. We use a neural network to find the embeddings for the best possible vine model and generate a structure. Through experiments on synthetic and real-world datasets, we show that our proposed approach fits the data better in terms of log-likelihood. Moreover, we demonstrate that the model is able to generate high-quality samples in a variety of applications, making it a good candidate for synthetic data generation.
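To give a rough sense of how quickly the space of vine configurations grows, the sketch below counts regular vine tree structures on d variables. It assumes the standard counting result for labeled regular vines, (d!/2) * 2^((d-2)(d-3)/2), which is not taken from this paper. The function name `count_regular_vines` is hypothetical, and the count covers tree structures only, not the choice of bivariate copula family on each edge.

```python
from math import factorial


def count_regular_vines(d: int) -> int:
    """Count labeled regular vine tree structures on d variables.

    Assumption: uses the known closed form (d!/2) * 2^((d-2)(d-3)/2).
    This ignores the choice of pair-copula family per edge, so the
    real model-selection space is larger still.
    """
    if d < 2:
        raise ValueError("a vine needs at least 2 variables")
    # (d-2)(d-3) is a product of consecutive integers, so it is always even.
    exponent = (d - 2) * (d - 3) // 2
    return factorial(d) // 2 * 2 ** exponent


if __name__ == "__main__":
    # Print the count for a few small dimensions to show the growth.
    for d in range(3, 11):
        print(d, count_regular_vines(d))
```

Even at d = 5 there are already hundreds of candidate structures, which is why exhaustive search is impractical and a learned search procedure is attractive.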
