Learning Identifiable Gaussian Bayesian Networks in Polynomial Time and Sample Complexity
This addresses a fundamental challenge in causal inference and probabilistic modeling for researchers and practitioners, offering a provably efficient solution for a specific class of networks, though it is incremental in improving theoretical guarantees.
The paper tackles the difficult problem of learning the directed acyclic graph (DAG) structure of Gaussian Bayesian networks from observational data, proposing a polynomial-time algorithm that recovers the true DAG with high probability using O(k^4 log p) samples, matching information-theoretic limits and outperforming existing state-of-the-art methods.
Learning the directed acyclic graph (DAG) structure of a Bayesian network from observational data is a notoriously difficult problem for which many hardness results are known. In this paper we propose a provably polynomial-time algorithm for learning sparse Gaussian Bayesian networks with equal noise variance --- a class of Bayesian networks for which the DAG structure can be uniquely identified from observational data --- under high-dimensional settings. We show that $O(k^4 \log p)$ number of samples suffices for our method to recover the true DAG structure with high probability, where $p$ is the number of variables and $k$ is the maximum Markov blanket size. We obtain our theoretical guarantees under a condition called Restricted Strong Adjacency Faithfulness, which is strictly weaker than strong faithfulness --- a condition that other methods based on conditional independence testing need for their success. The sample complexity of our method matches the information-theoretic limits in terms of the dependence on $p$. We show that our method out-performs existing state-of-the-art methods for learning Gaussian Bayesian networks in terms of recovering the true DAG structure while being comparable in speed to heuristic methods.