ML LG, Jun 27, 2025

Bayesian Invariance Modeling of Multi-Environment Data

Luhuan Wu, Mingzhang Yin, Yixin Wang, John P. Cunningham, David M. Blei

arXiv:2506.22675v312.35 citations, h-index: 101

Originality: Incremental advance

AI Analysis

This work addresses robust prediction and causal discovery across diverse environments; it is an incremental improvement over prior methods.

The paper tackles the problem of identifying invariant features in multi-environment data to improve generalization and causal inference. It develops Bayesian Invariant Prediction (BIP) and an efficient variational approximation (VI-BIP), which the authors report to be more accurate and scalable than existing methods in simulations and on real data.

Invariant prediction [Peters et al., 2016] analyzes feature/outcome data from multiple environments to identify invariant features: those with a stable predictive relationship to the outcome. Such features support generalization to new environments and help reveal causal mechanisms. Previous methods have primarily tackled this problem through hypothesis testing or regularized optimization. Here we develop Bayesian Invariant Prediction (BIP), a probabilistic model for invariant prediction. BIP encodes the indices of invariant features as a latent variable and recovers them by posterior inference. Under the assumptions of Peters et al. [2016], the BIP posterior targets the true invariant features. We prove that the posterior is consistent and that greater environment heterogeneity leads to faster posterior contraction. To handle many features, we design an efficient variational approximation called VI-BIP. In simulations and real data, we find that BIP and VI-BIP are more accurate and scalable than existing methods for invariant prediction.
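
The idea of treating the set of invariant features as a latent variable and scoring it by posterior inference can be illustrated with a toy sketch. This is not the authors' BIP model or VI-BIP: it assumes linear-Gaussian data, a uniform prior over feature subsets, and a BIC approximation in place of a true marginal likelihood. The function names (`gaussian_loglik`, `log_score`, `subset_posterior`) and the simulated data are hypothetical. Each candidate subset S is scored by a joint model in which the regression of y on x_S is shared across environments while every other conditional is fit separately per environment.

```python
import itertools
import numpy as np

def gaussian_loglik(target, design):
    """Maximized Gaussian log-likelihood of a linear regression (with intercept)
    and the number of fitted parameters (coefficients plus noise variance)."""
    n = len(target)
    A = np.hstack([np.ones((n, 1)), design])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    sigma2 = resid @ resid / n
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1), A.shape[1] + 1

def log_score(S, envs):
    """BIC-style score for the hypothesis that y | x_S is shared across environments."""
    S = list(S)
    d = envs[0][0].shape[1]
    rest = [j for j in range(d) if j not in S]
    # Shared piece: one pooled regression of y on x_S over all environments.
    X_all = np.vstack([X for X, _ in envs])
    y_all = np.concatenate([y for _, y in envs])
    loglik, n_params = gaussian_loglik(y_all, X_all[:, S])
    # Environment-specific pieces: chain rule over (x_S, y, x_rest), skipping y.
    for X, y in envs:
        Z = np.column_stack([X[:, S], y, X[:, rest]])
        for col in range(Z.shape[1]):
            if col == len(S):  # y is already covered by the pooled regression
                continue
            ll, k = gaussian_loglik(Z[:, col], Z[:, :col])
            loglik += ll
            n_params += k
    return loglik - 0.5 * n_params * np.log(len(y_all))

def subset_posterior(envs):
    """Approximate posterior over feature subsets under a uniform prior."""
    d = envs[0][0].shape[1]
    subsets = [s for r in range(d + 1) for s in itertools.combinations(range(d), r)]
    scores = np.array([log_score(s, envs) for s in subsets])
    weights = np.exp(scores - scores.max())  # subtract max for numerical stability
    return dict(zip(subsets, weights / weights.sum()))

# Simulated data (an assumption for illustration): x1 causes y, y causes x2,
# and both the x1 distribution and the x2 mechanism vary by environment.
rng = np.random.default_rng(0)
envs = []
for mean, scale, shift, noise in [(0.0, 1.0, 0.0, 1.0), (2.0, 2.0, 1.0, 0.3), (-1.0, 0.5, -2.0, 2.0)]:
    x1 = rng.normal(mean, scale, size=500)
    y = 1.5 * x1 + rng.normal(size=500)
    x2 = y + rng.normal(shift, noise, size=500)
    envs.append((np.column_stack([x1, x2]), y))

posterior = subset_posterior(envs)
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

In this simulation only feature index 0 (x1) has a stable relationship to y, so the sketch should concentrate its mass on the subset `(0,)`. Enumerating all subsets is exponential in the number of features, which is the scaling problem the abstract says VI-BIP is designed to address.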
