LGITMLJan 27, 2016

Information-theoretic limits of Bayesian network structure learning

arXiv:1601.07460v429 citations
Originality Incremental advance
AI Analysis

This provides fundamental limits for structure learning in Bayesian networks, which is crucial for fields like machine learning and statistics, though it is incremental as it extends existing information-theoretic methods.

The paper tackles the problem of determining the minimum number of samples needed to learn the structure of Bayesian networks from data, showing that it scales as Ω(m) for non-sparse networks and Ω(k log m + (k^2/m)) for sparse networks, where m is the number of variables and k is the maximum parents per node.

In this paper, we study the information-theoretic limits of learning the structure of Bayesian networks (BNs), on discrete as well as continuous random variables, from a finite number of samples. We show that the minimum number of samples required by any procedure to recover the correct structure grows as $Ω(m)$ and $Ω(k \log m + (k^2/m))$ for non-sparse and sparse BNs respectively, where $m$ is the number of variables and $k$ is the maximum number of parents per node. We provide a simple recipe, based on an extension of the Fano's inequality, to obtain information-theoretic limits of structure recovery for any exponential family BN. We instantiate our result for specific conditional distributions in the exponential family to characterize the fundamental limits of learning various commonly used BNs, such as conditional probability table based networks, gaussian BNs, noisy-OR networks, and logistic regression networks. En route to obtaining our main results, we obtain tight bounds on the number of sparse and non-sparse essential-DAGs. Finally, as a byproduct, we recover the information-theoretic limits of sparse variable selection for logistic regression.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes