OC LG MLJun 12, 2020

Kernel Distributionally Robust Optimization

Jia-Jie Zhu, Wittawat Jitkrittum, Moritz Diehl, Bernhard Schölkopf

arXiv:2006.06981v310.516 citations

Originality Highly original

AI Analysis

This work provides a foundational framework for robust optimization in machine learning, potentially impacting various domains by improving generalization under distribution shifts.

The authors tackled the problem of distributionally robust optimization (DRO) by proposing Kernel DRO, which uses reproducing kernel Hilbert spaces to construct convex ambiguity sets, unifying existing methods and enabling application to a broad class of loss functions without limitations like polynomial losses or Lipschitz constant knowledge.

We propose kernel distributionally robust optimization (Kernel DRO) using insights from the robust optimization theory and functional analysis. Our method uses reproducing kernel Hilbert spaces (RKHS) to construct a wide range of convex ambiguity sets, which can be generalized to sets based on integral probability metrics and finite-order moment bounds. This perspective unifies multiple existing robust and stochastic optimization methods. We prove a theorem that generalizes the classical duality in the mathematical problem of moments. Enabled by this theorem, we reformulate the maximization with respect to measures in DRO into the dual program that searches for RKHS functions. Using universal RKHSs, the theorem applies to a broad class of loss functions, lifting common limitations such as polynomial losses and knowledge of the Lipschitz constant. We then establish a connection between DRO and stochastic optimization with expectation constraints. Finally, we propose practical algorithms based on both batch convex solvers and stochastic functional gradient, which apply to general optimization and machine learning tasks.

View on arXiv PDF

Similar