LG | ML | Sep 28, 2020

CASTLE: Regularization via Auxiliary Causal Graph Discovery

arXiv:2009.13180v1 | 79 citations
Originality: Highly original
AI Analysis

This work addresses the need for more robust regularization methods in machine learning by integrating causality, offering potential improvements for practitioners dealing with generalization issues.

The paper tackles generalization in supervised models by building causal structure learning into the regularizer, and reports consistently better out-of-sample predictions than other benchmark regularizers.

Regularization improves generalization of supervised models to out-of-sample data. Prior works have shown that prediction in the causal direction (effect from cause) results in lower testing error than the anti-causal direction. However, existing regularization methods are agnostic of causality. We introduce Causal Structure Learning (CASTLE) regularization and propose to regularize a neural network by jointly learning the causal relationships between variables. CASTLE learns the causal directed acyclic graph (DAG) as an adjacency matrix embedded in the neural network's input layers, thereby facilitating the discovery of optimal predictors. Furthermore, CASTLE efficiently reconstructs only the features in the causal DAG that have a causal neighbor, whereas reconstruction-based regularizers suboptimally reconstruct all input features. We provide a theoretical generalization bound for our approach and conduct experiments on a range of synthetic and real publicly available datasets demonstrating that CASTLE consistently leads to better out-of-sample predictions than other popular benchmark regularizers.
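To make the idea of learning a DAG as an adjacency matrix concrete, here is a minimal sketch of one way to score how far a weighted adjacency matrix is from being acyclic. The abstract does not say which acyclicity measure CASTLE uses, so the trace-of-powers penalty below is an assumption borrowed from continuous DAG-learning methods, and the function names and example matrices are hypothetical. The penalty is zero exactly when the matrix has no directed cycle, so it could be added to a training loss as a regularization term.

```python
def matmul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity_penalty(m):
    """Return sum_{k=1..d} tr((M*M)^k) / k!, which is 0 iff M encodes a DAG.

    M*M is the elementwise square, so every entry is non-negative and
    tr((M*M)^k) is positive exactly when a closed walk of length k exists.
    Any directed cycle in a d-node graph has length at most d, so summing
    up to k = d is enough to detect it.
    """
    d = len(m)
    squared = [[w * w for w in row] for row in m]
    power = squared  # holds (M*M)^k for the current k
    factorial = 1.0
    penalty = 0.0
    for k in range(1, d + 1):
        if k > 1:
            power = matmul(power, squared)
            factorial *= k
        penalty += sum(power[i][i] for i in range(d)) / factorial
    return penalty


# Hypothetical 3-variable graphs: entry [i][j] != 0 means an edge i -> j.
dag = [[0.0, 0.8, 0.0],
       [0.0, 0.0, -1.2],
       [0.0, 0.0, 0.0]]
cyclic = [[0.0, 0.8, 0.0],
          [0.0, 0.0, -1.2],
          [0.5, 0.0, 0.0]]  # the extra edge 2 -> 0 closes a cycle

dag_penalty = acyclicity_penalty(dag)        # no cycle, so the penalty vanishes
cyclic_penalty = acyclicity_penalty(cyclic)  # strictly positive
```

In a real training setup the matrix would be derived from the network's input-layer weights and the penalty computed with an automatic differentiation library so that gradients flow back to those weights; the pure-Python version here only illustrates the quantity being penalized.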

Code Implementations: 1 repo