Sufficient Invariant Learning for Distribution Shift
This work addresses distribution shift, a critical problem for real-world machine learning applications in which training and test data differ. The contribution appears incremental, however, because it builds on existing invariant learning methods.
The paper tackles the problem of learning robust models under distribution shift. It introduces the Sufficient Invariant Learning (SIL) framework, which learns a sufficient subset of invariant features, and proposes the ASGDRO algorithm, which achieves robustness by seeking common flat minima across environments. Empirical evaluations on multiple datasets confirm its effectiveness.
Learning robust models under distribution shifts between training and test datasets is a fundamental challenge in machine learning. While learning invariant features across environments is a popular approach, it often assumes that these features are fully observed in both training and test sets, a condition frequently violated in practice. When models rely on invariant features absent in the test set, their robustness in new environments can deteriorate. To tackle this problem, we introduce a novel learning principle called the Sufficient Invariant Learning (SIL) framework, which focuses on learning a sufficient subset of invariant features rather than relying on a single feature. After demonstrating the limitation of existing invariant learning methods, we propose a new algorithm, Adaptive Sharpness-aware Group Distributionally Robust Optimization (ASGDRO), to learn diverse invariant features by seeking common flat minima across the environments. We theoretically demonstrate that finding common flat minima enables robust predictions based on diverse invariant features. Empirical evaluations on multiple datasets, including our new benchmark, confirm ASGDRO's robustness against distribution shifts, highlighting the limitations of existing methods.
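
The idea of seeking common flat minima across environments can be illustrated with a minimal sketch. It is not the authors' ASGDRO implementation, since the abstract does not specify the update rule. The sketch assumes a logistic-regression model, an element-wise scaled (ASAM-style) weight perturbation as one possible reading of "adaptive sharpness-aware", and the exponentiated-gradient group weights commonly used in Group DRO. The function names and the hyperparameters `lr`, `rho`, and `eta_q` are hypothetical.

```python
import numpy as np


def logistic_loss_and_grad(w, X, y):
    """Mean logistic loss and its gradient for labels y in {0, 1}."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    eps = 1e-12  # avoids log(0)
    loss = -np.mean(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))
    grad = X.T @ (p - y) / len(y)
    return loss, grad


def sharpness_aware_group_dro_step(w, q, groups, lr=0.1, rho=0.05, eta_q=0.1):
    """One illustrative update (assumed form, not taken from the paper).

    w      : parameter vector
    q      : current weights over environments (non-negative, sums to 1)
    groups : list of (X, y) pairs, one per training environment
    """
    losses, grads = [], []
    for X, y in groups:
        _, g = logistic_loss_and_grad(w, X, y)
        # Assumed adaptive perturbation: scale the ascent direction
        # element-wise by |w|, then normalise to radius rho.
        norm = np.linalg.norm(np.abs(w) * g) + 1e-12
        w_adv = w + rho * (w ** 2) * g / norm
        # Loss and gradient at the perturbed point approximate the
        # worst-case loss in a neighbourhood, which penalises sharp minima.
        loss_adv, g_adv = logistic_loss_and_grad(w_adv, X, y)
        losses.append(loss_adv)
        grads.append(g_adv)
    losses = np.array(losses)

    # Group DRO style update: up-weight environments with high perturbed loss.
    q = q * np.exp(eta_q * losses)
    q = q / q.sum()

    # Descend on the q-weighted combination of perturbed-point gradients.
    grad = sum(qi * gi for qi, gi in zip(q, grads))
    return w - lr * grad, q, losses
```

Repeating this step pushes the parameters toward a region where the perturbed loss is low for every environment at once, which is the informal sense of a "common flat minimum" in this sketch.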