LG ML | Sep 28, 2020

CASTLE: Regularization via Auxiliary Causal Graph Discovery

Trent Kyono, Yao Zhang, Mihaela van der Schaar

arXiv:2009.13180v118.179 citations

Originality: Highly original

AI Analysis

This work addresses the need for more robust regularization in machine learning by making the regularizer causality-aware, which may help practitioners whose models generalize poorly to out-of-sample data.

The paper aims to improve generalization in supervised models by incorporating causal structure learning into regularization. The authors report consistently better out-of-sample predictions than other benchmark regularizers.

Regularization improves generalization of supervised models to out-of-sample data. Prior works have shown that prediction in the causal direction (effect from cause) results in lower testing error than prediction in the anti-causal direction. However, existing regularization methods are agnostic of causality. We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables. CASTLE learns the causal directed acyclic graph (DAG) as an adjacency matrix embedded in the neural network's input layers, thereby facilitating the discovery of optimal predictors. Furthermore, CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features. We provide a theoretical generalization bound for our approach and conduct experiments on a range of synthetic and real publicly available datasets, demonstrating that CASTLE consistently leads to better out-of-sample predictions than other popular benchmark regularizers.
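
The abstract says the DAG is learned as an adjacency matrix but does not state how acyclicity is measured. As a minimal sketch, assuming a NOTEARS-style trace-exponential penalty (an assumption, not something stated above), the quantity below is zero exactly when a weighted adjacency matrix describes an acyclic graph. The function names `mat_mul` and `acyclicity_penalty` are hypothetical and chosen for illustration only.

```python
# Illustrative sketch only: the names are hypothetical, and the
# trace-exponential penalty is an assumed (NOTEARS-style) acyclicity measure.

def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity_penalty(weights):
    """Return sum over k = 1..d of tr(S^k) / k!, with S the elementwise square.

    This is the power series of tr(exp(S)) - d cut off after d terms. Any
    cycle in a d-node graph contains a simple cycle of length at most d, so
    the truncated sum is still zero exactly when the graph is acyclic.
    """
    d = len(weights)
    squared = [[w * w for w in row] for row in weights]
    power = [[float(i == j) for j in range(d)] for i in range(d)]  # identity
    penalty, factorial = 0.0, 1.0
    for k in range(1, d + 1):
        power = mat_mul(power, squared)   # power now holds S^k
        factorial *= k                    # factorial now holds k!
        penalty += sum(power[i][i] for i in range(d)) / factorial
    return penalty


# A nonzero entry [i][j] means an edge from variable i to variable j.
chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]   # 0 -> 1 -> 2, acyclic
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]   # 0 -> 1 -> 2 -> 0

chain_penalty = acyclicity_penalty(chain)   # 0.0
cycle_penalty = acyclicity_penalty(cycle)   # 0.5, from tr(S^3) / 3! = 3 / 6
```

In a training loop, a term like this would typically be added to the prediction loss with a weighting coefficient, so that the learned input-layer matrix is pushed toward a DAG. The weighting and the exact loss composition are not given in the abstract and are left open here.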
