From dependency to causality: a machine learning approach
This work addresses causal inference in multivariate systems and is aimed at researchers in statistics and machine learning. It is an incremental advance that extends existing data-driven methods for cause-effect pairs to settings with more than two variables.
The paper tackles the problem of inferring directed causal links between pairs of variables in multivariate settings with more than two variables. It uses a supervised machine learning approach based on asymmetries in conditional independence relations, and reports that this approach can extract causal information from such data.
The relationship between statistical dependency and causality lies at the heart of all statistical approaches to causal inference. Recent results in the ChaLearn cause-effect pair challenge have shown that data-driven approaches can infer causal directionality with good accuracy even in Markov indistinguishable configurations. This paper proposes a supervised machine learning approach to infer the existence of a directed causal link between two variables in multivariate settings with $n>2$ variables. The approach relies on the asymmetry of some conditional (in)dependence relations between the members of the Markov blankets of two causally connected variables. Our results show that supervised learning methods may be successfully used to extract causal information on the basis of asymmetric statistical descriptors for distributions over $n>2$ variables as well.
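To make the notion of asymmetric conditional (in)dependence concrete, the following minimal sketch simulates a hypothetical linear-Gaussian system c_i -> z_i -> z_j <- c_j and compares two partial correlations. The graph, the variable names, the unit coefficients, and the use of partial correlation as a dependence measure are all assumptions chosen for illustration; they are not the descriptors or the learning procedure used in the paper.

```python
import numpy as np


def partial_corr(x, y, z):
    """Correlation between x and y after linearly regressing z out of both."""
    design = np.column_stack([np.ones_like(z), z])
    res_x = x - design @ np.linalg.lstsq(design, x, rcond=None)[0]
    res_y = y - design @ np.linalg.lstsq(design, y, rcond=None)[0]
    return float(np.corrcoef(res_x, res_y)[0, 1])


# Hypothetical toy system (an assumption, not taken from the paper):
#   c_i -> z_i -> z_j <- c_j
rng = np.random.default_rng(0)
n = 20_000
c_i = rng.normal(size=n)                # parent of the cause z_i
c_j = rng.normal(size=n)                # second parent of the effect z_j
z_i = c_i + rng.normal(size=n)          # cause
z_j = z_i + c_j + rng.normal(size=n)    # effect

# Cause side: c_i reaches z_j only through z_i, so conditioning on z_i
# should remove the dependence between c_i and z_j.
cause_side = partial_corr(c_i, z_j, z_i)

# Effect side: c_j and z_i are marginally independent, but z_j is a common
# effect of both, so conditioning on z_j should induce a dependence.
marginal_cj_zi = float(np.corrcoef(c_j, z_i)[0, 1])
effect_side = partial_corr(c_j, z_i, z_j)

print(f"corr(c_i, z_j | z_i) = {cause_side:+.3f}")
print(f"corr(c_j, z_i)       = {marginal_cj_zi:+.3f}")
print(f"corr(c_j, z_i | z_j) = {effect_side:+.3f}")
```

In this toy setting the first two quantities should be close to zero and the third clearly negative, so the pair (z_i, z_j) looks different depending on which variable is treated as the cause. A supervised approach of the kind described above could, in principle, compute many such descriptors over Markov blanket members and train a classifier on them using data generated from graphs with known structure.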