LG AI MLOct 19, 2012

Large-Sample Learning of Bayesian Networks is NP-Hard

David Maxwell Chickering, Christopher Meek, David Heckerman

arXiv:1212.2468v1841 citations

Originality Incremental advance

AI Analysis

This is an incremental result that establishes computational hardness for a specific class of learning algorithms in machine learning.

The paper tackles the problem of learning discrete-variable Bayesian networks from large datasets, showing that identifying high-scoring structures is NP-hard even with access to oracles and for networks with limited parent nodes.

In this paper, we provide new complexity results for algorithms that learn discrete-variable Bayesian networks from data. Our results apply whenever the learning algorithm uses a scoring criterion that favors the simplest model able to represent the generative distribution exactly. Our results therefore hold whenever the learning algorithm uses a consistent scoring criterion and is applied to a sufficiently large dataset. We show that identifying high-scoring structures is hard, even when we are given an independence oracle, an inference oracle, and/or an information oracle. Our negative results also apply to the learning of discrete-variable Bayesian networks in which each node has at most k parents, for all k > 3.

View on arXiv PDF

Similar