LG · ML · May 27, 2019

Distributionally Robust Optimization and Generalization in Kernel Methods

arXiv:1905.10943v1 · 154 citations
Originality: Incremental advance
AI Analysis

This work addresses robustness and generalization in machine learning, offering incremental theoretical insights by connecting DRO to existing regularization methods.

The paper tackles distributionally robust optimization (DRO) by proposing uncertainty sets based on maximum mean discrepancy (MMD). It shows that MMD DRO is roughly equivalent to Hilbert norm regularization and provides an alternative proof of a generalization bound for Gaussian kernel ridge regression.

Distributionally robust optimization (DRO) has attracted attention in machine learning due to its connections to regularization, generalization, and robustness. Existing work has considered uncertainty sets based on phi-divergences and Wasserstein distances, each of which has drawbacks. In this paper, we study DRO with uncertainty sets measured via maximum mean discrepancy (MMD). We show that MMD DRO is roughly equivalent to regularization by the Hilbert norm and, as a byproduct, reveal deep connections to classic results in statistical learning. In particular, we obtain an alternative proof of a generalization bound for Gaussian kernel ridge regression via a DRO lens. The proof also suggests a new regularizer. Our results apply beyond kernel methods: we derive a generically applicable approximation of MMD DRO, and show that it generalizes recent work on variance-based regularization.
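
The link between MMD uncertainty sets and Hilbert norm regularization can be illustrated numerically. The sketch below is not taken from the paper. It only checks the standard Cauchy-Schwarz inequality |E_Q[f] - E_P[f]| <= MMD(P, Q) * ||f||_H for a function f in a Gaussian-kernel RKHS, which implies that the worst case of E_Q[f] over an MMD ball of radius eps around P is at most E_P[f] + eps * ||f||_H. The sample sizes, bandwidth, and helper names (`gaussian_kernel`, `mmd`, `f`) are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Kernel matrix with entries exp(-||x_i - y_j||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(x**2, axis=1)[:, None]
        + np.sum(y**2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))

def mmd(x, y, bandwidth=1.0):
    """Biased (V-statistic) MMD between the empirical distributions of x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return np.sqrt(max(k_xx + k_yy - 2.0 * k_xy, 0.0))

rng = np.random.default_rng(0)
p_sample = rng.normal(0.0, 1.0, size=(200, 2))  # sample from P (illustrative)
q_sample = rng.normal(0.5, 1.0, size=(200, 2))  # sample from a shifted Q

# A function in the RKHS: f = sum_j alpha_j * k(., c_j).
# Its squared Hilbert norm is alpha^T K(c, c) alpha.
centers = rng.normal(0.0, 1.0, size=(10, 2))
alpha = rng.normal(size=10)
hilbert_norm = np.sqrt(alpha @ gaussian_kernel(centers, centers) @ alpha)

def f(x):
    """Evaluate f at each row of x."""
    return gaussian_kernel(x, centers) @ alpha

# Change in the expected value of f when moving from P to Q,
# compared with the MMD-times-norm upper bound.
gap = abs(f(q_sample).mean() - f(p_sample).mean())
bound = mmd(p_sample, q_sample) * hilbert_norm
print(f"gap = {gap:.4f}, bound = {bound:.4f}")
```

With the biased estimator, the inequality holds exactly for the two empirical distributions, so `gap` never exceeds `bound` beyond floating-point error. The paper's result concerns losses composed with a model and is more involved than this check.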

Code Implementations: 1 repo
