LG | AI | Jul 2, 2021

The Causal-Neural Connection: Expressiveness, Learnability, and Inference

Kevin Xia, Kai-Zhan Lee, Yoshua Bengio, Elias Bareinboim

arXiv:2107.00793v328.9149 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses a foundational issue in causal inference: neural networks, however expressive, cannot learn causal structure from observational data alone. Its advance is incremental, formalizing the structural constraints needed to bridge neural and causal models.

The paper asks whether neural networks can learn any structural causal model (SCM) from data. It shows that they cannot predict the effects of interventions from observational data alone: the limit lies in learnability, not expressiveness. It then introduces neural causal models (NCMs), which carry a new inductive bias encoding structural constraints, to enable causal identification and estimation. Simulations support the approach.

One of the central elements of any causal inference is an object called a structural causal model (SCM), which represents a collection of mechanisms and exogenous sources of random variation of the system under investigation (Pearl, 2000). An important property of many kinds of neural networks is universal approximability: the ability to approximate any function to arbitrary precision. Given this property, one may be tempted to surmise that a collection of neural nets is capable of learning any SCM by training on data generated by that SCM. In this paper, we show this is not the case by disentangling the notions of expressivity and learnability. Specifically, we show that the causal hierarchy theorem (Thm. 1, Bareinboim et al., 2020), which describes the limits of what can be learned from data, still holds for neural models. For instance, an arbitrarily complex and expressive neural net is unable to predict the effects of interventions given observational data alone. Given this result, we introduce a special type of SCM called a neural causal model (NCM), and formalize a new type of inductive bias to encode structural constraints necessary for performing causal inferences. Building on this new class of models, we focus on solving two canonical tasks found in the literature known as causal identification and estimation. Leveraging the neural toolbox, we develop an algorithm that is both sufficient and necessary to determine whether a causal effect can be learned from data (i.e., causal identifiability); it then estimates the effect whenever identifiability holds (causal estimation). Simulations corroborate the proposed approach.
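The claim that observational data alone cannot pin down the effect of an intervention can be illustrated with a toy example. The sketch below is not taken from the paper: the two binary SCMs (`scm_a`, `scm_b`) and the helper names are hypothetical, chosen only to show two models that agree on the observational distribution P(X, Y) yet disagree on P(Y = 1 | do(X = 1)). Any learner fit to the observational data, neural or otherwise, would see the same input in both cases.

```python
from fractions import Fraction

# Hypothetical toy SCMs, for illustration only (not from the paper).
# Both share one exogenous variable U ~ Bernoulli(1/2).

def scm_a(u, do_x=None):
    """X := U, Y := X  (X causes Y)."""
    x = u if do_x is None else do_x
    y = x
    return x, y

def scm_b(u, do_x=None):
    """X := U, Y := U  (U confounds X and Y; X has no effect on Y)."""
    x = u if do_x is None else do_x
    y = u
    return x, y

def distribution(scm, do_x=None):
    """Exact joint distribution over (X, Y), enumerating U in {0, 1}."""
    dist = {}
    for u in (0, 1):
        outcome = scm(u, do_x)
        dist[outcome] = dist.get(outcome, Fraction(0)) + Fraction(1, 2)
    return dist

def p_y1(dist):
    """Marginal probability that Y = 1."""
    return sum(p for (x, y), p in dist.items() if y == 1)

obs_a, obs_b = distribution(scm_a), distribution(scm_b)
int_a, int_b = distribution(scm_a, do_x=1), distribution(scm_b, do_x=1)

print(obs_a == obs_b)            # True: identical observational distributions
print(p_y1(int_a), p_y1(int_b))  # 1 1/2: different effects of do(X = 1)
```

Because the two models are observationally indistinguishable, extra structural assumptions are needed to choose between them, which is the role the abstract assigns to the NCM's inductive bias.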

