Claudia Czado

ML
h-index45
4papers
42citations
Novelty51%
AI Score41

4 Papers

MLMar 24
Stepwise Variational Inference with Vine Copulas

Elisabeth Griesbauer, Leiv Rønneberg, Arnoldo Frigessi et al.

We propose stepwise variational inference (VI) with vine copulas: a universal VI procedure that combines vine copulas with a novel stepwise estimation procedure of the variational parameters. Vine copulas consist of a nested sequence of trees built from copulas, where more complex latent dependence can be modeled with increasing number of trees. We propose to estimate the vine copula approximate posterior in a stepwise fashion, tree by tree along the vine structure. Further, we show that the usual backward Kullback-Leibler divergence cannot recover the correct parameters in the vine copula model, thus the evidence lower bound is defined based on the Rényi divergence. Finally, an intuitive stopping criterion for adding further trees to the vine eliminates the need to pre-define a complexity parameter of the variational distribution, as required for most other approaches. Thus, our method interpolates between mean-field VI (MFVI) and full latent dependence. In many applications, in particular sparse Gaussian processes, our method is parsimonious with parameters, while outperforming MFVI.

LGMar 20, 2025
TVineSynth: A Truncated C-Vine Copula Generator of Synthetic Tabular Data to Balance Privacy and Utility

Elisabeth Griesbauer, Claudia Czado, Arnoldo Frigessi et al.

We propose TVineSynth, a vine copula based synthetic tabular data generator, which is designed to balance privacy and utility, using the vine tree structure and its truncation to do the trade-off. Contrary to synthetic data generators that achieve DP by globally adding noise, TVineSynth performs a controlled approximation of the estimated data generating distribution, so that it does not suffer from poor utility of the resulting synthetic data for downstream prediction tasks. TVineSynth introduces a targeted bias into the vine copula model that, combined with the specific tree structure of the vine, causes the model to zero out privacy-leaking dependencies while relying on those that are beneficial for utility. Privacy is here measured with membership (MIA) and attribute inference attacks (AIA). Further, we theoretically justify how the construction of TVineSynth ensures AIA privacy under a natural privacy measure for continuous sensitive attributes. When compared to competitor models, with and without DP, on simulated and on real-world data, TVineSynth achieves a superior privacy-utility balance.

MEFeb 5, 2021
Vine copula mixture models and clustering for non-Gaussian data

Özge Sahin, Claudia Czado

The majority of finite mixture models suffer from not allowing asymmetric tail dependencies within components and not capturing non-elliptical clusters in clustering applications. Since vine copulas are very flexible in capturing these types of dependencies, we propose a novel vine copula mixture model for continuous data. We discuss the model selection and parameter estimation problems and further formulate a new model-based clustering algorithm. The use of vine copulas in clustering allows for a range of shapes and dependency structures for the clusters. Our simulation experiments illustrate a significant gain in clustering accuracy when notably asymmetric tail dependencies or/and non-Gaussian margins within the components exist. The analysis of real data sets accompanies the proposed method. We show that the model-based clustering algorithm with vine copula mixture models outperforms the other model-based clustering techniques, especially for the non-Gaussian multivariate data.

MLSep 15, 2017
Dependence Modeling in Ultra High Dimensions with Vine Copulas and the Graphical Lasso

Dominik Müller, Claudia Czado

To model high dimensional data, Gaussian methods are widely used since they remain tractable and yield parsimonious models by imposing strong assumptions on the data. Vine copulas are more flexible by combining arbitrary marginal distributions and (conditional) bivariate copulas. Yet, this adaptability is accompanied by sharply increasing computational effort as the dimension increases. The approach proposed in this paper overcomes this burden and makes the first step into ultra high dimensional non-Gaussian dependence modeling by using a divide-and-conquer approach. First, we apply Gaussian methods to split datasets into feasibly small subsets and second, apply parsimonious and flexible vine copulas thereon. Finally, we reconcile them into one joint model. We provide numerical results demonstrating the feasibility of our approach in moderate dimensions and showcase its ability to estimate ultra high dimensional non-Gaussian dependence models in thousands of dimensions.