LG, May 21

Learning Causal Orderings for In-Context Tabular Prediction

Sascha Xu, Sarah Mameche, Jilles Vreeken

arXiv:2605.22335

AI Analysis

For practitioners using tabular prediction models, this work addresses the unreliability of correlational methods under distribution shift by integrating causal structure learning directly into the predictive architecture.

TabOrder learns causal variable orderings for in-context tabular prediction, improving robustness under distribution shift and intervention. It recovers accurate orderings while achieving strong predictive and imputation performance.

In-context learning for tabular data sets strong predictive standards in observational settings; however, it relies primarily on correlational structure, which becomes unreliable under distribution shift or intervention. Established methods for discovering causal structure exist, but they often focus on structure identifiability and are decoupled from the predictive architectures that could benefit from them. To bridge these perspectives, we study how to simultaneously infer causal structure, in the form of topological variable orderings, and enforce it in tabular prediction. Unlike standard architectures, our model TabOrder uses causal order-constrained attention, basing predictions only on features that precede a target under a learned causal order. Like causal discovery methods, TabOrder learns the optimal variable ordering in an unsupervised manner through a likelihood-based objective. We justify this choice under standard functional model classes and also study how sample missingness, a common challenge in tabular data, interacts with causal direction identification. Empirically, we confirm that TabOrder recovers accurate variable orderings while addressing prediction and imputation tasks, and that it gives insight into real-world biological data under intervention.
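
To make the idea of order-constrained attention concrete, the following is a minimal NumPy sketch, not TabOrder's actual implementation. It assumes the variable ordering is already given as a permutation of feature indices (the paper learns it), and the helper names `order_mask` and `masked_attention` are hypothetical. The mask lets each feature attend only to features that come earlier in the ordering.

```python
import numpy as np

def order_mask(order):
    """Return a boolean mask M with M[i, j] True iff feature j precedes feature i.

    `order` lists feature indices from earliest to latest in an assumed
    (hypothetical) causal order; here it is given, not learned.
    """
    order = np.asarray(order)
    d = order.size
    position = np.empty(d, dtype=int)
    position[order] = np.arange(d)  # position[f] = rank of feature f in the order
    return position[None, :] < position[:, None]

def masked_attention(scores, mask):
    """Row-wise softmax restricted to allowed entries.

    Rows with no allowed entry (the first variable in the order) get all zeros.
    """
    masked = np.where(mask, scores, -np.inf)
    row_max = np.max(masked, axis=1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)  # guard fully masked rows
    weights = np.exp(masked - row_max)  # exp(-inf) evaluates to 0
    total = weights.sum(axis=1, keepdims=True)
    return np.divide(weights, total, out=np.zeros_like(weights), where=total > 0)

# Illustrative order: feature 2 first, then feature 0, then feature 1.
mask = order_mask([2, 0, 1])
attn = masked_attention(np.zeros((3, 3)), mask)
```

In this sketch the first variable in the order has no predecessors, so its attention row is all zeros; how a real model handles that root variable (for example, predicting from its marginal) is a design choice not covered here.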
