Decorr: Environment Partitioning for Invariant Learning and OOD Generalization
Decorr addresses a practical bottleneck for practitioners who apply invariant learning methods when environments are not inherent in the data, though the contribution appears incremental because it builds on existing invariant learning frameworks.
The paper tackles environment partitioning for invariant learning in out-of-distribution generalization. It proposes Decorr, a method that isolates low-correlation data subsets to serve as environments, and reports superior performance in experiments with synthetic and real data.
Invariant learning methods, aimed at identifying a consistent predictor across multiple environments, are gaining prominence in out-of-distribution (OOD) generalization. Yet, when environments aren't inherent in the data, practitioners must define them manually. This environment partitioning--algorithmically segmenting the training dataset into environments--crucially affects invariant learning's efficacy but remains underdiscussed. Proper environment partitioning could broaden the applicability of invariant learning and enhance its performance. In this paper, we suggest partitioning the dataset into several environments by isolating low-correlation data subsets. Through experiments with synthetic and real data, our Decorr method demonstrates superior performance in combination with invariant learning. Decorr mitigates the issue of spurious correlations, aids in identifying stable predictors, and broadens the applicability of invariant learning methods.
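The abstract does not say how low-correlation subsets are isolated, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes that "low correlation" means a small off-diagonal Frobenius norm of the feature correlation matrix, that environments have roughly equal size, and that a simple random-swap local search is an acceptable stand-in for whatever optimization the authors use. The function names (`offdiag_corr_norm`, `low_corr_subset`, `partition_environments`) and their parameters are hypothetical.

```python
import numpy as np


def offdiag_corr_norm(X):
    """Frobenius norm of the off-diagonal part of the feature correlation matrix."""
    C = np.corrcoef(X, rowvar=False)
    C = np.nan_to_num(C)  # constant columns give NaN; treat them as uncorrelated
    return float(np.linalg.norm(C - np.diag(np.diag(C))))


def low_corr_subset(X, size, n_swaps=2000, seed=0):
    """Assumed heuristic: random-swap local search for a low-correlation subset.

    Requires 0 < size < len(X). Returns (row indices, final score).
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    mask = np.zeros(n, dtype=bool)
    mask[rng.choice(n, size=size, replace=False)] = True
    best = offdiag_corr_norm(X[mask])
    for _ in range(n_swaps):
        i = rng.choice(np.flatnonzero(mask))   # candidate to remove
        j = rng.choice(np.flatnonzero(~mask))  # candidate to add
        mask[i], mask[j] = False, True
        score = offdiag_corr_norm(X[mask])
        if score < best:
            best = score                       # keep the swap
        else:
            mask[i], mask[j] = True, False     # undo the swap
    return np.flatnonzero(mask), best


def partition_environments(X, n_envs, n_swaps=2000, seed=0):
    """Peel off low-correlation subsets one at a time; the last environment
    receives whatever rows remain."""
    remaining = np.arange(X.shape[0])
    envs = []
    for e in range(n_envs - 1):
        size = len(remaining) // (n_envs - e)
        local, _ = low_corr_subset(X[remaining], size, n_swaps, seed + e)
        envs.append(remaining[local])
        remaining = np.delete(remaining, local)
    envs.append(remaining)
    return envs
```

Under these assumptions, `partition_environments(X, n_envs=2)` returns a list of index arrays that could be passed as environment labels to an invariant learning method. Because the first subset absorbs the least-correlated rows, the leftover environment tends to retain stronger correlations, which gives the environments differing correlation structure.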