LG ML | Oct 1, 2021

ML4C: Seeing Causality Through Latent Vicinity

Haoyue Dai, Rui Ding, Yuanyuan Jiang, Shi Han, Dongmei Zhang

arXiv:2110.00637v4 | 10.617 citations | Has Code

Originality: Incremental advance

AI Analysis

This paper addresses a fundamental question for researchers in causal inference: how supervision benefits causal learning. The contribution appears incremental, however, because it builds on existing SCL frameworks.

The paper tackles Supervised Causal Learning (SCL) on discrete data by proposing ML4C, a method that classifies whether each unshielded triple is a v-structure and uses the resulting orientations to construct the causal graph. The authors report that it outperforms state-of-the-art algorithms in accuracy, reliability, robustness, and tolerance on benchmark datasets.

Supervised Causal Learning (SCL) aims to learn causal relations from observational data by accessing previously seen datasets associated with ground-truth causal relations. This paper presents a first attempt at addressing a fundamental question: what benefits does supervision bring, and how does it bring them? Starting from the observation that SCL is no better than random guessing if the learning target is non-identifiable a priori, we propose a two-phase paradigm for SCL that explicitly considers structure identifiability. Following this paradigm, we tackle the problem of SCL on discrete data and propose ML4C. The core of ML4C is a binary classifier with a novel learning target: it classifies whether an Unshielded Triple (UT) is a v-structure or not. Specifically, starting from an input dataset with the corresponding skeleton provided, ML4C orients each UT once it is classified as a v-structure. These v-structures are together used to construct the final output. To address the fundamental question of SCL, we propose a principled method for ML4C featurization: we exploit the vicinity of a given UT (i.e., the neighbors of the UT in the skeleton) and derive features by considering the conditional dependencies and structural entanglement within the vicinity. We further prove that ML4C is asymptotically correct. Last but not least, thorough experiments conducted on benchmark datasets demonstrate that ML4C remarkably outperforms other state-of-the-art algorithms in terms of accuracy, reliability, robustness, and tolerance. In summary, ML4C shows promising results in validating the effectiveness of supervision for causal learning. Our code is publicly available at https://github.com/microsoft/ML4C.
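
The orientation step described in the abstract can be illustrated with a minimal sketch. It is not taken from the ML4C repository: the function names (`unshielded_triples`, `vicinity`, `orient_v_structures`), the adjacency-dict representation of the skeleton, and the placeholder classifier are all assumptions made for illustration. The real method replaces the placeholder with a trained binary classifier fed by features derived from the vicinity.

```python
from itertools import combinations


def unshielded_triples(skeleton):
    """Yield (x, t, y) with x - t and t - y adjacent but x, y non-adjacent.

    `skeleton` maps each node to the set of its undirected neighbours.
    """
    for t in sorted(skeleton):
        for x, y in combinations(sorted(skeleton[t]), 2):
            if y not in skeleton[x]:
                yield (x, t, y)


def vicinity(skeleton, triple):
    """Neighbours of the triple's nodes in the skeleton, excluding the triple itself."""
    nodes = set(triple)
    return set().union(*(skeleton[n] for n in nodes)) - nodes


def orient_v_structures(skeleton, is_v_structure):
    """Return directed edges (parent, child) for each UT the classifier accepts."""
    directed = set()
    for x, t, y in unshielded_triples(skeleton):
        if is_v_structure(x, t, y):  # hypothetical stand-in for the learned classifier
            directed.add((x, t))
            directed.add((y, t))
    return directed


# Toy skeleton (assumed for illustration): A - C, B - C, C - D.
skeleton = {"A": {"C"}, "B": {"C"}, "C": {"A", "B", "D"}, "D": {"C"}}


def toy_classifier(x, t, y):
    # Placeholder decision rule: accept only the triple A - C - B.
    return (x, t, y) == ("A", "C", "B")


triples = list(unshielded_triples(skeleton))
edges = orient_v_structures(skeleton, toy_classifier)
neighbourhood = vicinity(skeleton, ("A", "C", "B"))
```

In this toy graph the three unshielded triples all share the middle node C, and only the accepted one contributes the directed edges A → C and B → C. Edges not covered by any accepted v-structure stay undirected in this sketch.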
