Distributionally Robust Learning with Stable Adversarial Training
The paper addresses robustness in machine learning for applications where the data distribution may shift, though the contribution appears incremental because it builds on existing distributionally robust optimization methods.
The paper tackles the problem of machine learning models being vulnerable to distributional shifts by proposing Stable Adversarial Learning (SAL), which constructs a more practical uncertainty set based on stable correlations and achieves uniformly good performance across unknown shifts.
Machine learning algorithms with empirical risk minimization are vulnerable under distributional shifts due to the greedy adoption of all the correlations found in training data. There is an emerging literature on tackling this problem by minimizing the worst-case risk over an uncertainty set. However, existing methods mostly construct ambiguity sets by treating all variables equally regardless of the stability of their correlations with the target, resulting in an overwhelmingly large uncertainty set and low confidence of the learner. In this paper, we propose a novel Stable Adversarial Learning (SAL) algorithm that leverages heterogeneous data sources to construct a more practical uncertainty set and conduct differentiated robustness optimization, where covariates are differentiated according to the stability of their correlations with the target. We theoretically show that our method is tractable for stochastic gradient-based optimization and provide performance guarantees for our method. Empirical studies on both simulated and real datasets validate the effectiveness of our method in terms of uniformly good performance across unknown distributional shifts.
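The idea of differentiated robustness can be illustrated with a small sketch. This is not the SAL algorithm itself: it assumes a linear model with squared loss, and it assumes the per-covariate cost weights are fixed by hand, whereas the abstract describes deriving the differentiation from heterogeneous data sources. The function names (`perturb`, `mse`), the penalty strength `lam`, and the step sizes are all hypothetical choices made for illustration. The inner maximization is approximated by gradient ascent on a penalized objective, in which perturbing a covariate with a large cost weight (a "stable" one) is expensive, so the adversary mostly shifts the cheaper, less stable covariates.

```python
import numpy as np

def mse(X, y, theta):
    """Mean squared error of the linear model X @ theta."""
    return float(np.mean((X @ theta - y) ** 2))

def perturb(X, y, theta, cost_weights, lam=5.0, steps=50, lr=0.01):
    """Approximate worst-case inputs by gradient ascent on, per sample,
        (theta . (x + delta) - y)^2 - lam * sum_j cost_weights[j] * delta_j^2.
    A larger cost weight makes it more expensive to move that covariate.
    lam and lr are assumed values chosen so the ascent is stable here."""
    delta = np.zeros_like(X)
    for _ in range(steps):
        resid = (X + delta) @ theta - y                        # shape (n,)
        grad_loss = 2.0 * resid[:, None] * theta[None, :]      # d loss / d delta
        grad_pen = 2.0 * lam * cost_weights[None, :] * delta   # d penalty / d delta
        delta += lr * (grad_loss - grad_pen)
    return X + delta

# Toy data: two covariates, the first treated as stable (high perturbation cost).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.0, 1.0]) + rng.normal(size=200)
theta = np.array([1.0, 1.0])
cost_weights = np.array([10.0, 1.0])  # hand-picked, not learned

X_adv = perturb(X, y, theta, cost_weights)
shift = np.abs(X_adv - X).mean(axis=0)
print("mean shift per covariate:", shift)
print("clean loss:", mse(X, y, theta), "perturbed loss:", mse(X_adv, y, theta))
```

A full robust training loop would alternate this perturbation step with a gradient descent step on `theta` using the perturbed inputs. Treating all covariates equally corresponds to setting every entry of `cost_weights` to the same value, which lets the adversary shift every covariate to the same degree and reflects the overly large uncertainty set the abstract criticizes.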