MLLGOCOct 21, 2023

Distributionally Robust Optimization with Bias and Variance Reduction

UW
arXiv:2310.13863v16 citationsh-index: 47
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficient and robust optimization for risk-sensitive learning in ML, offering a practical solution for applications like distribution shift and fairness, though it is incremental in improving existing DRO methods.

The paper tackles the distributionally robust optimization problem with spectral risk-based uncertainty sets by introducing Prospect, a stochastic gradient-based algorithm that requires only a single learning rate hyperparameter and achieves linear convergence for smooth regularized losses, converging 2-3 times faster than baselines on benchmarks.

We consider the distributionally robust optimization (DRO) problem with spectral risk-based uncertainty set and $f$-divergence penalty. This formulation includes common risk-sensitive learning objectives such as regularized condition value-at-risk (CVaR) and average top-$k$ loss. We present Prospect, a stochastic gradient-based algorithm that only requires tuning a single learning rate hyperparameter, and prove that it enjoys linear convergence for smooth regularized losses. This contrasts with previous algorithms that either require tuning multiple hyperparameters or potentially fail to converge due to biased gradient estimates or inadequate regularization. Empirically, we show that Prospect can converge 2-3$\times$ faster than baselines such as stochastic gradient and stochastic saddle-point methods on distribution shift and fairness benchmarks spanning tabular, vision, and language domains.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes