AIFeb 7, 2018

Efficient Learning of Bounded-Treewidth Bayesian Networks from Complete and Incomplete Data Sets

arXiv:1802.02468v130 citations
AI Analysis

This work addresses the computational efficiency challenge in learning Bayesian networks for probabilistic inference, offering a faster alternative for data imputation tasks, though it is incremental as it builds on existing structural EM methods.

The paper tackles the problem of learning bounded-treewidth Bayesian networks from both complete and incomplete datasets, introducing the k-MAX algorithm that scales to thousands of variables and yields higher-scoring structures than competitors on complete data. For incomplete data, it integrates k-MAX into structural EM to achieve the same imputation accuracy as state-of-the-art methods in about one-tenth of the time, with linear worst-case complexity and easy parallelizability.

Learning a Bayesian networks with bounded treewidth is important for reducing the complexity of the inferences. We present a novel anytime algorithm (k-MAX) method for this task, which scales up to thousands of variables. Through extensive experiments we show that it consistently yields higher-scoring structures than its competitors on complete data sets. We then consider the problem of structure learning from incomplete data sets. This can be addressed by structural EM, which however is computationally very demanding. We thus adopt the novel k-MAX algorithm in the maximization step of structural EM, obtaining an efficient computation of the expected sufficient statistics. We test the resulting structural EM method on the task of imputing missing data, comparing it against the state-of-the-art approach based on random forests. Our approach achieves the same imputation accuracy of the competitors, but in about one tenth of the time. Furthermore we show that it has worst-case complexity linear in the input size, and that it is easily parallelizable.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes