Coordinated Multi-Neighborhood Learning on a Directed Acyclic Graph
This work addresses efficient and reliable causal discovery for researchers in various disciplines who need to focus on specific target nodes in high-dimensional networks. It represents an incremental improvement over existing methods.
The paper tackles the challenge of learning causal directed acyclic graph (DAG) structures in high-dimensional settings. It develops a constraint-based method that estimates the local structure around multiple user-specified target nodes and coordinates structure learning between neighborhoods, without learning the entire DAG. Experimental results show that the algorithm learns neighborhood structures more accurately, and at much lower computational cost, than standard methods that estimate the entire DAG.
Learning the structure of causal directed acyclic graphs (DAGs) is useful in many areas of machine learning and artificial intelligence. However, in the high-dimensional setting, it is challenging to obtain good empirical and theoretical results without strong and often restrictive assumptions. Additionally, it is questionable whether all of the variables purported to be included in the network are observable. It is therefore of interest to restrict consideration to a subset of the variables for relevant and reliable inferences. In fact, researchers in various disciplines can usually select a set of target nodes in the network for causal discovery. This paper develops a new constraint-based method for estimating the local structure around multiple user-specified target nodes, enabling coordination in structure learning between neighborhoods. Our method facilitates causal discovery without learning the entire DAG structure. We establish consistency results for our algorithm with respect to the local neighborhood structure of the target nodes in the true graph. Experimental results on synthetic and real-world data show that our algorithm learns the neighborhood structures more accurately, and at much lower computational cost, than standard methods that estimate the entire DAG. An R package implementing our methods is available at https://github.com/stephenvsmith/CML.
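
To make the estimation target concrete, the following minimal Python sketch extracts the local structure around several target nodes from a known DAG. It is an illustration only and not the algorithm of the paper or the API of the CML package: it reads edges off a true graph rather than inferring them from conditional independence tests. The definition of a neighborhood used here (a target together with its parents and children), the function names `neighborhood` and `local_structure`, and the toy graph are all assumptions made for this example; the paper's precise definition may differ.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # directed edge stored as (parent, child)


def neighborhood(edges: Set[Edge], target: str) -> Set[str]:
    """Return the target plus its parents and children.

    Assumption: this parent/child definition is chosen for illustration.
    """
    nodes = {target}
    for parent, child in edges:
        if parent == target:
            nodes.add(child)
        elif child == target:
            nodes.add(parent)
    return nodes


def local_structure(edges: Set[Edge], targets: Iterable[str]) -> Set[Edge]:
    """Return the edges induced on the union of the targets' neighborhoods."""
    keep: Set[str] = set()
    for target in targets:
        keep |= neighborhood(edges, target)
    # Keep only edges whose endpoints both lie in the combined neighborhood.
    return {(p, c) for (p, c) in edges if p in keep and c in keep}


# Hypothetical toy DAG: G -> A -> B -> C -> D -> E <- F
toy_edges = {
    ("G", "A"), ("A", "B"), ("B", "C"),
    ("C", "D"), ("D", "E"), ("F", "E"),
}

# Targets B and D share the neighbor C, so their neighborhoods overlap.
result = local_structure(toy_edges, ["B", "D"])
print(sorted(result))
```

In this toy graph the two neighborhoods share node C, so the combined local structure links them through the edges B -> C and C -> D, while G and F, which are adjacent to neither target, are left out. This overlap is the kind of situation in which coordinating structure learning between neighborhoods matters.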