ML, Dec 22, 2020
Unbiased Gradient Estimation for Distributionally Robust Learning
Soumyadip Ghosh, Mark Squillante
Seeking to improve model generalization, we consider a new approach based on distributionally robust learning (DRL) that applies stochastic gradient descent to the outer minimization problem. Our algorithm efficiently estimates the gradient of the inner maximization problem through multi-level Monte Carlo randomization. Leveraging theoretical results that shed light on why standard gradient estimators fail, we establish the optimal parameterization of the gradient estimators of our approach that balances a fundamental tradeoff between computation time and statistical variance. Numerical experiments demonstrate that our DRL approach yields significant benefits over previous work.
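To make the multi-level Monte Carlo randomization idea concrete, here is a minimal sketch of a randomized multi-level estimator, not the authors' implementation. Several choices are assumptions that the abstract does not specify: the inner maximization is taken to be KL-penalized with a fixed penalty `lam` (so the worst-case weights are an exponential tilt of the per-sample losses), the random level is geometric with ratio `r = 2**-1.5`, and the names `tilted_gradient` and `mlmc_gradient` are hypothetical. A plain mini-batch version of `tilted_gradient` is a ratio estimator and is biased for finite batches; the randomized level correction removes that bias in expectation.

```python
import numpy as np

def tilted_gradient(losses, grads, lam):
    """Gradient of the KL-penalized inner maximum on one batch (assumed form).

    The worst-case weights are proportional to exp(loss / lam); the gradient
    is the weighted average of the per-sample gradients under those weights.
    """
    w = np.exp((losses - losses.max()) / lam)  # subtract max for stability
    w /= w.sum()
    return w @ grads

def mlmc_gradient(losses, grads, lam, rng, base_log2=2, r=2.0 ** -1.5):
    """One draw of a randomized multi-level estimator (illustrative sketch).

    losses: per-sample losses at the current parameters, shape (n,)
    grads:  per-sample gradients at the current parameters, shape (n, d)
    """
    n = len(losses)
    # Random level N with P(N = k) = (1 - r) * r**k for k = 0, 1, 2, ...
    level = int(rng.geometric(1.0 - r)) - 1
    p_level = (1.0 - r) * r ** level
    m = 2 ** (base_log2 + level)  # size of each half-batch at this level

    # Fine estimate on 2m samples versus the average of its two halves.
    idx = rng.integers(0, n, size=2 * m)  # sampling with replacement
    fine = tilted_gradient(losses[idx], grads[idx], lam)
    even, odd = idx[0::2], idx[1::2]
    coarse = 0.5 * (tilted_gradient(losses[even], grads[even], lam)
                    + tilted_gradient(losses[odd], grads[odd], lam))

    # Independent base-level estimate on 2**base_log2 samples.
    base_idx = rng.integers(0, n, size=2 ** base_log2)
    base = tilted_gradient(losses[base_idx], grads[base_idx], lam)

    # The level differences telescope in expectation, so the importance
    # weight 1 / p_level makes the estimator unbiased for the full-data value.
    return base + (fine - coarse) / p_level
```

The ratio `r` controls the tradeoff the abstract mentions: a smaller `r` draws cheap low levels more often but inflates the variance of the rare high-level corrections, while a larger `r` does the opposite. In this sketch the expected batch size is finite only for `r < 1/2`, and the usual variance argument for smooth functions of sample means needs `r > 1/4`.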
ML, May 22, 2018
Efficient Stochastic Gradient Descent for Learning with Distributionally Robust Optimization
Soumyadip Ghosh, Mark Squillante, Ebisa Wollega
Distributionally robust optimization (DRO) problems are increasingly seen as a viable method to train machine learning models for improved model generalization. These min-max formulations, however, are more difficult to solve. We therefore provide a new stochastic gradient descent algorithm to efficiently solve these DRO formulations. Our approach applies gradient descent to the outer minimization formulation and estimates the gradient of the inner maximization based on a sample average approximation. The latter uses a subset of the data in each iteration, progressively increasing the subset size to ensure convergence. Theoretical results include establishing the optimal manner for growing the support size to balance a fundamental tradeoff between stochastic error and computational effort. Empirical results demonstrate the significant benefits of our approach over previous work, and also illustrate how learning with DRO can improve generalization.
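The scheme described above, gradient descent on the outer problem with the inner maximization solved on a growing subsample, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' algorithm: the inner problem is assumed to be KL-penalized with a fixed penalty `lam` (which gives closed-form worst-case weights), the model is linear least squares, and the geometric growth schedule with parameters `m0` and `growth` is a placeholder, not the optimal schedule derived in the paper. The function names are hypothetical.

```python
import numpy as np

def worst_case_weights(losses, lam):
    """Maximizer of sum_i p_i * loss_i - lam * KL(p || uniform) on the simplex."""
    w = np.exp((losses - losses.max()) / lam)  # subtract max for stability
    return w / w.sum()

def dro_sgd(X, y, lam=1.0, m0=8, growth=1.05, steps=300, lr=0.05, seed=0):
    """SGD on the outer minimization with a progressively larger subsample."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    theta = np.zeros(d)
    sizes = []
    for t in range(steps):
        # Assumed schedule: grow the subset geometrically, capped at n.
        m = min(n, int(np.ceil(m0 * growth ** t)))
        idx = rng.choice(n, size=m, replace=False)

        # Per-sample squared-error losses on the current subset.
        resid = X[idx] @ theta - y[idx]
        losses = 0.5 * resid ** 2

        # Inner maximization on the subset (sample average approximation).
        p = worst_case_weights(losses, lam)

        # Gradient of the inner maximum with respect to theta: the weights
        # are held at their maximizing values (Danskin's theorem).
        grad = X[idx].T @ (p * resid)
        theta -= lr * grad
        sizes.append(m)
    return theta, sizes
```

Small subsets make early iterations cheap but give a noisy, biased view of the worst-case distribution; larger subsets reduce that error at a higher cost per step, which is the tradeoff the growth schedule is meant to balance.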