LG, ME, ML · Jun 13, 2012

Learning the Bayesian Network Structure: Dirichlet Prior versus Data

arXiv:1206.3287v1 · 50 citations
Originality: Incremental advance
AI Analysis

This work addresses a specific methodological issue in Bayesian statistics for researchers in graphical models, offering incremental theoretical and practical insights.

The paper analyzes the effect of the equivalent sample size (ESS) in Dirichlet priors on Bayesian network structure learning, showing that large ESS values favor edges when conditional distributions are non-uniform, and provides an analytical approximation for optimal ESS values in predictive tasks.

In the Bayesian approach to structure learning of graphical models, the equivalent sample size (ESS) in the Dirichlet prior over the model parameters was recently shown to have an important effect on the maximum-a-posteriori estimate of the Bayesian network structure. In our first contribution, we theoretically analyze the case of large ESS-values, which complements previous work: among other results, we find that the presence of an edge in a Bayesian network is favoured over its absence even if both the Dirichlet prior and the data imply independence, as long as the conditional empirical distribution is notably different from uniform. In our second contribution, we focus on realistic ESS-values, and provide an analytical approximation to the "optimal" ESS-value in a predictive sense (its accuracy is also validated experimentally): this approximation provides an understanding as to which properties of the data have the main effect determining the "optimal" ESS-value.
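The large-ESS effect described in the abstract can be illustrated with a minimal sketch. It assumes the BDeu form of the Dirichlet prior (the ESS is split uniformly over all parent-child cells), a single binary parent Y and binary child X, and a hypothetical count table chosen so that the data are exactly independent but far from uniform. The function names and the counts are illustrative and are not taken from the paper.

```python
from math import lgamma


def bdeu_local_score(counts, ess):
    """Log marginal likelihood of one node under a BDeu prior (assumed form).

    counts[j][k] is the number of cases with parent configuration j
    and child state k; ess is the equivalent sample size.
    """
    q = len(counts)        # number of parent configurations
    r = len(counts[0])     # number of child states
    a_j = ess / q          # prior mass per parent configuration
    a_jk = ess / (q * r)   # prior mass per (parent, child) cell
    score = 0.0
    for row in counts:
        score += lgamma(a_j) - lgamma(a_j + sum(row))
        for n in row:
            score += lgamma(a_jk + n) - lgamma(a_jk)
    return score


def log_bayes_factor_edge(counts, ess):
    """Score with the edge Y -> X minus score without it.

    The parent's own term is identical in both structures, so only
    the child's local score differs.
    """
    # Without the edge, the child has a single (empty) parent configuration.
    marginal = [[sum(col) for col in zip(*counts)]]
    return bdeu_local_score(counts, ess) - bdeu_local_score(marginal, ess)


# Hypothetical data: empirically independent (81 * 1 == 9 * 9), but skewed.
counts = [[81, 9], [9, 1]]

for ess in (1.0, 1e4):
    print(f"ESS = {ess:g}: log Bayes factor = {log_bayes_factor_edge(counts, ess):.4f}")
```

Under these assumptions the log Bayes factor is negative at ESS = 1 (the edge is rejected) and positive at ESS = 10^4 (the edge is favoured), even though the counts carry no evidence of dependence.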

