Minimax Statistical Learning with Wasserstein Distances
This work addresses robustness in statistical learning under distribution shift, but the contribution appears incremental because it builds on existing distributionally robust optimization methods.
The paper proposes a minimax framework for distributionally robust optimization in which the ambiguity sets are balls in Wasserstein space. It proves generalization bounds based on covering numbers and applies the framework to domain adaptation, with guarantees that hold when the Wasserstein distance between the source and target distributions is estimated from unlabeled samples.
As opposed to standard empirical risk minimization (ERM), distributionally robust optimization aims to minimize the worst-case risk over a larger ambiguity set containing the original empirical distribution of the training data. In this work, we describe a minimax framework for statistical learning with ambiguity sets given by balls in Wasserstein space. In particular, we prove generalization bounds that involve the covering number properties of the original ERM problem. As an illustrative example, we provide generalization guarantees for transport-based domain adaptation problems where the Wasserstein distance between the source and target domain distributions can be reliably estimated from unlabeled samples.
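To make the domain adaptation setting concrete, the following is a minimal sketch, not the paper's estimator or its bound. It assumes one-dimensional unlabeled samples of equal size from the source and target domains, for which the 1-Wasserstein distance between the two empirical distributions reduces to the mean absolute difference of the sorted samples. It then applies the standard Lipschitz bound: if the loss is L-Lipschitz, the worst-case risk over a 1-Wasserstein ball of radius rho around P is at most the risk under P plus L * rho. The function names `empirical_w1` and `lipschitz_worst_case_bound` are hypothetical and chosen only for illustration.

```python
def empirical_w1(source, target):
    """1-Wasserstein distance between two 1-D empirical distributions.

    Assumption of this sketch: both samples have the same size and uniform
    weights, so the optimal coupling matches sorted points in order.
    """
    if len(source) != len(target):
        raise ValueError("this sketch assumes equal sample sizes")
    if not source:
        raise ValueError("samples must be non-empty")
    xs, ys = sorted(source), sorted(target)
    return sum(abs(x - y) for x, y in zip(xs, ys)) / len(xs)


def lipschitz_worst_case_bound(risk, lipschitz, radius):
    """Upper bound on the worst-case risk over a 1-Wasserstein ball.

    Uses the standard fact that an L-Lipschitz loss changes in expectation
    by at most L * W_1(P, Q); this is not the paper's generalization bound.
    """
    if lipschitz < 0 or radius < 0:
        raise ValueError("lipschitz constant and radius must be non-negative")
    return risk + lipschitz * radius


# Hypothetical unlabeled samples from the source and target domains.
source_samples = [0.0, 1.0, 2.0]
target_samples = [1.0, 2.0, 3.0]

# Estimated distance, used here as the radius of the ambiguity ball.
rho = empirical_w1(source_samples, target_samples)
bound = lipschitz_worst_case_bound(risk=0.5, lipschitz=2.0, radius=rho)
```

In this toy case the target sample is the source sample shifted by one unit, so the estimated distance equals the size of the shift. For samples of unequal size or in higher dimensions, the sorted matching no longer applies and a general optimal transport solver would be needed.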