ML LG Jun 28, 2019

Causal Regularization

arXiv:1906.12179v118.856 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of causal inference for researchers in statistics and machine learning, offering incremental theoretical insights into regularization's role in reducing confounding effects.

The paper tackles the problem of improving causal inference from observational data by showing that regularization in regression methods can yield better causal models even with infinite data, and provides a causal generalization bound for non-linear regression under specific confounding models.

I argue that regularizing terms in standard regression methods not only help against overfitting finite data, but sometimes also yield better causal models in the infinite sample regime. I first consider a multi-dimensional variable linearly influencing a target variable with some multi-dimensional unobserved common cause, where the confounding effect can be decreased by keeping the penalizing term in Ridge and Lasso regression even in the population limit. Choosing the size of the penalizing term is, however, challenging, because cross-validation is pointless. Here it is done by first estimating the strength of confounding via a method proposed earlier, which yielded some reasonable results for simulated and real data. Further, I prove a "causal generalization bound" which states (subject to a particular model of confounding) that the error made by interpreting any non-linear regression as a causal model can be bounded from above whenever functions are taken from a not too rich class. In other words, the bound guarantees "generalization" from observational to interventional distributions, which is usually not the subject of statistical learning theory (and is only possible due to the underlying symmetries of the confounder model).
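The population-limit claim for Ridge regression can be illustrated with a small sketch. It is not the paper's code and makes several assumptions: a hypothetical linear model X = E + Mᵀ Z, Y = aᵀ X + cᵀ Z + noise with a standardized latent confounder Z, hand-picked toy values for `a`, `M`, `c` and `sigma_ee`, and hypothetical function names. Because everything is computed from exact covariances, no sampling is involved, so any effect of the penalty `lam` is not an overfitting effect.

```python
import numpy as np


def population_ridge(sigma_xx, sigma_xy, lam):
    """Ridge coefficients in the infinite-sample limit: (Sigma_XX + lam*I)^{-1} Sigma_XY."""
    d = sigma_xx.shape[0]
    return np.linalg.solve(sigma_xx + lam * np.eye(d), sigma_xy)


def causal_error(sigma_ee, M, a, c, lam):
    """Squared distance between the population ridge vector and the causal vector a.

    Assumed model (illustrative only): X = E + M^T Z, Y = a^T X + c^T Z + noise,
    with Cov(Z) = I, Cov(E) = sigma_ee, and E, Z, noise mutually independent.
    """
    sigma_xx = sigma_ee + M.T @ M          # Cov(X)
    sigma_xy = sigma_xx @ a + M.T @ c      # Cov(X, Y), including the confounding term
    coef = population_ridge(sigma_xx, sigma_xy, lam)
    return float(np.sum((coef - a) ** 2))


# Toy values chosen so that Cov(X) is the identity and the arithmetic is easy to check.
a = np.array([1.0, 0.0])            # true causal coefficients
M = np.array([[0.0, 0.5]])          # one latent confounder acting on the second feature
c = np.array([2.0])                 # confounder's direct effect on Y
sigma_ee = np.diag([1.0, 0.75])     # covariance of the unconfounded part of X

for lam in (0.0, 0.5, 1.0, 2.0):
    print(f"lambda={lam:.1f}  causal error={causal_error(sigma_ee, M, a, c, lam):.4f}")
```

In this toy setting the unpenalized population regression (`lam = 0`) has squared causal error 1, while `lam = 1` halves it to 0.5. The best penalty depends on the confounding strength, which is unknown in practice and which cross-validation on observational data does not reveal.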
