ML · LG · Jun 5, 2020

Principled learning method for Wasserstein distributionally robust optimization with local perturbations

arXiv:2006.03333v2 · 16 citations
AI Analysis

This work addresses robustness in machine learning models for noisy data, representing an incremental advancement in WDRO theory and application.

The paper aims to improve both the theoretical understanding and the practical robustness of Wasserstein distributionally robust optimization (WDRO). It proposes a minimizer based on a novel approximation theorem, proves the corresponding risk consistency results, and reports significantly higher accuracy than baseline models on noisy image datasets.

Wasserstein distributionally robust optimization (WDRO) attempts to learn a model that minimizes the local worst-case risk in the vicinity of the empirical data distribution, where the vicinity is defined by a Wasserstein ball. While WDRO has received attention as a promising tool for inference since its introduction, its theoretical understanding has not fully matured. Gao et al. (2017) proposed a minimizer based on a tractable approximation of the local worst-case risk, but without showing risk consistency. In this paper, we propose a minimizer based on a novel approximation theorem and provide the corresponding risk consistency results. Furthermore, we develop WDRO inference for locally perturbed data, a setting that includes Mixup (Zhang et al., 2017) as a special case. We show that our approximation and risk consistency results extend naturally to the case where the data are locally perturbed. Numerical experiments on image classification datasets demonstrate the robustness of the proposed method. Our results show that the proposed method achieves significantly higher accuracy than baseline models on noisy datasets.
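
To make the "locally perturbed data" setting concrete, the following is a minimal sketch of standard Mixup, which the abstract names as a special case. It is not the paper's training procedure: the function names `mixup_pair` and `mixup_batch`, the default `alpha=0.2`, and the use of plain Python lists are all assumptions made for illustration. Each perturbed sample is a convex combination of two training examples, with the mixing weight drawn from a Beta(alpha, alpha) distribution.

```python
import random


def mixup_pair(x_i, y_i, x_j, y_j, alpha=0.2, rng=random):
    """Build one Mixup sample from two labeled examples (illustrative helper).

    x_i, x_j: feature vectors of equal length (lists of floats).
    y_i, y_j: one-hot label vectors of equal length (lists of floats).
    alpha:    shape parameter of the Beta(alpha, alpha) distribution;
              0.2 is an assumed default, not a value taken from the paper.
    """
    lam = rng.betavariate(alpha, alpha)  # mixing weight in [0, 1]
    # Convex combination of the inputs and of the labels with the same weight.
    x_mix = [lam * a + (1.0 - lam) * b for a, b in zip(x_i, x_j)]
    y_mix = [lam * a + (1.0 - lam) * b for a, b in zip(y_i, y_j)]
    return x_mix, y_mix, lam


def mixup_batch(xs, ys, alpha=0.2, seed=0):
    """Mix every example with a randomly chosen partner from the same batch."""
    rng = random.Random(seed)
    partners = list(range(len(xs)))
    rng.shuffle(partners)  # partner index for each example
    mixed = []
    for i, j in enumerate(partners):
        x_mix, y_mix, _ = mixup_pair(xs[i], ys[i], xs[j], ys[j], alpha, rng)
        mixed.append((x_mix, y_mix))
    return mixed


if __name__ == "__main__":
    xs = [[0.0, 0.0], [1.0, 1.0]]
    ys = [[1.0, 0.0], [0.0, 1.0]]
    for x_mix, y_mix in mixup_batch(xs, ys):
        print(x_mix, y_mix)
```

Because each mixed point lies on the segment between two observed examples, the perturbation stays local to the training data, which is the sense in which Mixup fits the abstract's description.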

Code Implementations: 1 repo