ML, LG, OC | Nov 9, 2023

Outlier-Robust Wasserstein DRO

Sloan Nietert, Ziv Goldfeld, Soroosh Shafiee

arXiv:2311.05573v1

Originality: Incremental advance

AI Analysis

This work addresses a critical limitation of Wasserstein DRO (WDRO) for robust decision-making in machine learning: its sensitivity to outliers. It is, however, an incremental improvement over existing WDRO methods.

The paper tackles the vulnerability of Wasserstein distributionally robust optimization (WDRO) to adversarial outliers. It proposes an outlier-robust WDRO framework that handles both geometric (Wasserstein) and non-geometric (total variation) perturbations, with minimax optimal excess risk bounds and a tractable convex reformulation for efficient computation.

Distributionally robust optimization (DRO) is an effective approach for data-driven decision-making in the presence of uncertainty. Geometric uncertainty due to sampling or localized perturbations of data points is captured by Wasserstein DRO (WDRO), which seeks to learn a model that performs uniformly well over a Wasserstein ball centered around the observed data distribution. However, WDRO fails to account for non-geometric perturbations such as adversarial outliers, which can greatly distort the Wasserstein distance measurement and impede the learned model. We address this gap by proposing a novel outlier-robust WDRO framework for decision-making under both geometric (Wasserstein) perturbations and non-geometric (total variation (TV)) contamination that allows an $\varepsilon$-fraction of data to be arbitrarily corrupted. We design an uncertainty set using a certain robust Wasserstein ball that accounts for both perturbation types and derive minimax optimal excess risk bounds for this procedure that explicitly capture the Wasserstein and TV risks. We prove a strong duality result that enables tractable convex reformulations and efficient computation of our outlier-robust WDRO problem. When the loss function depends only on low-dimensional features of the data, we eliminate certain dimension dependencies from the risk bounds that are unavoidable in the general setting. Finally, we present experiments validating our theory on standard regression and classification tasks.
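To make the abstract's claim about outliers concrete, the following is a minimal sketch, not the paper's method, of how a small fraction of corrupted points can inflate a Wasserstein distance. It assumes one-dimensional data and equal sample sizes, where the order-1 Wasserstein distance between two empirical distributions reduces to the mean absolute difference of the sorted samples. The helper name `w1_empirical`, the sample values, and the 5% contamination level are all illustrative assumptions.

```python
import numpy as np


def w1_empirical(x, y):
    """1-D Wasserstein-1 distance between two equal-size empirical samples.

    In one dimension the optimal coupling matches sorted points in order,
    so the distance is the mean absolute gap between the sorted samples.
    """
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    if x.shape != y.shape:
        raise ValueError("samples must have the same size")
    return float(np.mean(np.abs(x - y)))


# Clean sample and a reference that differs by a small geometric shift.
clean = np.linspace(0.0, 1.0, 100)
reference = clean + 0.05

# Hypothetical contamination: 5% of the points are moved far away.
contaminated = clean.copy()
contaminated[-5:] = 1000.0

d_clean = w1_empirical(clean, reference)
d_contaminated = w1_empirical(contaminated, reference)

print(f"W1(clean, reference)        = {d_clean:.4f}")
print(f"W1(contaminated, reference) = {d_contaminated:.4f}")
```

In this toy setup the small shift gives a distance of 0.05, while corrupting 5% of the points pushes it to roughly 50. A plain Wasserstein ball around the contaminated sample would therefore need a radius orders of magnitude larger to cover the reference distribution, which is the failure mode the outlier-robust ball is designed to avoid.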
