LGOCFeb 5, 2021

Integer Programming for Causal Structure Learning in the Presence of Latent Variables

arXiv:2102.03129v319 citations
AI Analysis

This work provides an exact, score-based method for learning causal structures with latent variables, which is a significant improvement for researchers and practitioners in causal inference who need to model more complex real-world scenarios.

This paper addresses the problem of learning causal structures represented by ancestral acyclic directed mixed graphs (ADMGs) in the presence of latent variables. The authors propose a novel exact score-based method using an integer programming (IP) formulation, which achieves better accuracy than existing heuristic score-based and benchmark constraint-based methods for medium-sized problems.

The problem of finding an ancestral acyclic directed mixed graph (ADMG) that represents the causal relationships between a set of variables is an important area of research on causal inference. Most existing score-based structure learning methods focus on learning directed acyclic graph (DAG) models without latent variables. A number of score-based methods have recently been proposed for the ADMG learning, yet they are heuristic in nature and do not guarantee an optimal solution. We propose a novel exact score-based method that solves an integer programming (IP) formulation and returns a score-maximizing ancestral ADMG for a set of continuous variables that follow a multivariate Gaussian distribution. We generalize the state-of-the-art IP model for DAG learning problems and derive new classes of valid inequalities to formulate an IP model for ADMG learning. Empirically, our model can be solved efficiently for medium-sized problems and achieves better accuracy than state-of-the-art score-based methods as well as benchmark constraint-based methods.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes