LG OC ML | Aug 19, 2021

Risk Bounds and Calibration for a Smart Predict-then-Optimize Method

arXiv:2108.08887v216.029 citations

Originality: Incremental advance

AI Analysis

This work addresses decision-making under uncertainty in stochastic optimization. It provides theoretical guarantees for the SPO+ surrogate loss, which its experiments compare against standard prediction-error losses. The advance is incremental because it builds on prior SPO+ loss research.

The paper expands the consistency results for the SPO+ loss, a convex surrogate for the SPO loss in the predict-then-optimize framework. It develops risk bounds and uniform calibration results that translate excess surrogate risk into excess true risk. Its experiments demonstrate the strength of SPO+ on portfolio allocation and cost-sensitive multi-class classification problems.

The predict-then-optimize framework is fundamental in practical stochastic decision-making problems: first predict unknown parameters of an optimization model, then solve the problem using the predicted values. A natural loss function in this setting is defined by measuring the decision error induced by the predicted parameters, which was named the Smart Predict-then-Optimize (SPO) loss by Elmachtoub and Grigas [arXiv:1710.08005]. Since the SPO loss is typically nonconvex and possibly discontinuous, Elmachtoub and Grigas [arXiv:1710.08005] introduced a convex surrogate, called the SPO+ loss, that importantly accounts for the underlying structure of the optimization model. In this paper, we greatly expand upon the consistency results for the SPO+ loss provided by Elmachtoub and Grigas [arXiv:1710.08005]. We develop risk bounds and uniform calibration results for the SPO+ loss relative to the SPO loss, which provide a quantitative way to transfer the excess surrogate risk to excess true risk. By combining our risk bounds with generalization bounds, we show that the empirical minimizer of the SPO+ loss achieves low excess true risk with high probability. We first demonstrate these results in the case when the feasible region of the underlying optimization problem is a polyhedron, and then we show that the results can be strengthened substantially when the feasible region is a level set of a strongly convex function. We perform experiments to empirically demonstrate the strength of the SPO+ surrogate, as compared to standard $\ell_1$ and squared $\ell_2$ prediction error losses, on portfolio allocation and cost-sensitive multi-class classification problems.
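To make the two losses in the abstract concrete, here is a minimal sketch and not the authors' implementation. It assumes the downstream problem is a minimization of c^T w over a feasible region S, and it uses the definitions recalled from Elmachtoub and Grigas [arXiv:1710.08005]: the SPO loss is c^T w*(c_hat) - c^T w*(c), and the SPO+ loss is max over w in S of (c - 2 c_hat)^T w, plus 2 c_hat^T w*(c), minus c^T w*(c). For illustration only, S is taken to be the unit simplex (a polyhedron), so every optimal decision is a vertex e_i and each optimization reduces to a min or max over coordinates. The function names and the first-index tie-breaking rule are hypothetical choices, not taken from the paper.

```python
from typing import Sequence


def argmin_vertex(cost: Sequence[float]) -> int:
    """Index i of the simplex vertex e_i that minimizes cost^T w.

    Ties are broken by taking the first minimizer (an assumption of this sketch).
    """
    return min(range(len(cost)), key=lambda i: cost[i])


def spo_loss(c_hat: Sequence[float], c: Sequence[float]) -> float:
    """Excess true cost of acting on the prediction c_hat instead of the true c."""
    decision = argmin_vertex(c_hat)   # w*(c_hat): decision induced by the prediction
    return c[decision] - min(c)       # c^T w*(c_hat) - c^T w*(c)


def spo_plus_loss(c_hat: Sequence[float], c: Sequence[float]) -> float:
    """Convex surrogate: max_w (c - 2 c_hat)^T w + 2 c_hat^T w*(c) - c^T w*(c)."""
    # Over the simplex, the maximum of a linear function is its largest coordinate.
    support = max(ci - 2.0 * chi for ci, chi in zip(c, c_hat))
    best = argmin_vertex(c)           # w*(c): optimal decision under the true cost
    return support + 2.0 * c_hat[best] - c[best]


if __name__ == "__main__":
    c_true = [1.0, 2.0, 3.0]
    c_pred = [2.0, 1.0, 3.0]          # the prediction ranks the second option first
    print(spo_loss(c_pred, c_true))       # 1.0
    print(spo_plus_loss(c_pred, c_true))  # 3.0
```

In this example the surrogate value (3.0) upper-bounds the true decision loss (1.0), and both losses are zero when the prediction equals the true cost vector. The risk bounds and calibration results described in the abstract quantify how a small excess SPO+ risk translates into a small excess SPO risk.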
