Mining Invariance from Nonlinear Multi-Environment Data: Binary Classification
This work addresses out-of-distribution generalization for binary classification. The contribution is incremental in that it builds on existing invariance perspectives for multi-environment data.
The paper tackles prediction in unseen environments using data from multiple training environments. It identifies a form of invariance specific to binary classification and shows that this invariance is robust even when environmental conditions vary greatly.
Making predictions in an unseen environment given data from multiple training environments is a challenging task. We approach this problem from an invariance perspective, focusing on binary classification to shed light on general nonlinear data generation mechanisms. We identify a unique form of invariance, one that exists solely in the binary setting, which allows us to train models that are invariant across environments. We provide sufficient conditions for such invariance and show that it is robust even when environmental conditions vary greatly. Our formulation admits a causal interpretation, allowing us to compare it with various frameworks. Finally, we propose a heuristic prediction method and conduct experiments using real and synthetic datasets.