ME ST ML · Dec 29, 2014

Marginal likelihood and model selection for Gaussian latent tree and forest models

Mathias Drton, Shaowei Lin, Luca Weihs, Piotr Zwiernik

arXiv:1412.8285v2 · 34 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses model selection in latent variable models, a problem of interest to statisticians and machine learning practitioners. It is incremental in that it extends existing theory to specific singular cases.

The authors address model selection in Gaussian latent tree and forest models by computing real log-canonical thresholds, which quantify the large-sample behavior of the marginal likelihood. These thresholds supply the input for a generalized Bayesian information criterion, which the authors demonstrate in simulations and a data example.

Gaussian latent tree models, or more generally, Gaussian latent forest models have Fisher-information matrices that become singular along interesting submodels, namely, models that correspond to subforests. For these singularities, we compute the real log-canonical thresholds (also known as stochastic complexities or learning coefficients) that quantify the large-sample behavior of the marginal likelihood in Bayesian inference. This provides the information needed for a recently introduced generalization of the Bayesian information criterion. Our mathematical developments treat the general setting of Laplace integrals whose phase functions are sums of squared differences between monomials and constants. We clarify how in this case real log-canonical thresholds can be computed using polyhedral geometry, and we show how to apply the general theory to the Laplace integrals associated with Gaussian latent tree and forest models. In simulations and a data example, we demonstrate how the mathematical knowledge can be applied in model selection.
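To make the role of the real log-canonical threshold concrete, the sketch below evaluates the standard large-sample approximation of the log marginal likelihood from singular learning theory, log L ≈ (maximized log-likelihood) − λ log n + (m − 1) log log n, where λ is the real log-canonical threshold and m its multiplicity. This is a minimal illustration, not the authors' model selection procedure: the function name `rlct_penalized_loglik` and all numeric values are hypothetical, and λ and m must be supplied by the user (the paper's contribution is computing them for latent tree and forest models). For a regular model with d parameters, λ = d/2 and m = 1, which recovers the usual BIC penalty.

```python
import math


def rlct_penalized_loglik(max_loglik, n, rlct, multiplicity=1):
    """Approximate the log marginal likelihood up to an O_p(1) term.

    max_loglik   : maximized log-likelihood of the model (assumed given)
    n            : sample size, must exceed 1 so that log(log(n)) is defined
    rlct         : real log-canonical threshold (learning coefficient) lambda
    multiplicity : multiplicity m of the threshold (m = 1 in regular models)
    """
    if n <= 1:
        raise ValueError("n must be greater than 1")
    return (
        max_loglik
        - rlct * math.log(n)
        + (multiplicity - 1) * math.log(math.log(n))
    )


# Hypothetical numbers, for illustration only.
n = 500
loglik = -1234.5
d = 4  # parameter count of a regular model

# Regular case: lambda = d/2 and m = 1 give the classical BIC form.
regular = rlct_penalized_loglik(loglik, n, rlct=d / 2)
bic_form = loglik - (d / 2) * math.log(n)

# Singular case: a smaller threshold and a multiplicity above 1
# (values assumed here, not taken from the paper) reduce the penalty.
singular = rlct_penalized_loglik(loglik, n, rlct=1.5, multiplicity=2)

print(regular, bic_form, singular)
```

With the same maximized log-likelihood, the singular score is larger than the regular one, which reflects why using the parameter count d/2 in place of the true threshold can over-penalize a model at its singularities.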
