MLLGOCMay 22, 2018

Efficient Stochastic Gradient Descent for Learning with Distributionally Robust Optimization

arXiv:1805.08728v28 citations
Originality Incremental advance
AI Analysis

This addresses the computational difficulty of DRO formulations for machine learning practitioners seeking better generalization, though it is incremental as it builds on existing DRO methods.

The paper tackles the challenge of efficiently solving distributionally robust optimization (DRO) problems for improved model generalization by proposing a new stochastic gradient descent algorithm that balances stochastic error and computational effort, achieving significant empirical benefits over previous work.

Distributionally robust optimization (DRO) problems are increasingly seen as a viable method to train machine learning models for improved model generalization. These min-max formulations, however, are more difficult to solve. We therefore provide a new stochastic gradient descent algorithm to efficiently solve this DRO formulation. Our approach applies gradient descent to the outer minimization formulation and estimates the gradient of the inner maximization based on a sample average approximation. The latter uses a subset of the data in each iteration, progressively increasing the subset size to ensure convergence. Theoretical results include establishing the optimal manner for growing the support size to balance a fundamental tradeoff between stochastic error and computational effort. Empirical results demonstrate the significant benefits of our approach over previous work, and also illustrate how learning with DRO can improve generalization.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes