MLApr 13, 2023
Estimate-Then-Optimize versus Integrated-Estimation-Optimization versus Sample Average Approximation: A Stochastic Dominance PerspectiveAdam N. Elmachtoub, Henry Lam, Haofeng Zhang et al.
In data-driven stochastic optimization, model parameters of the underlying distribution need to be estimated from data in addition to the optimization task. Recent literature considers integrating the estimation and optimization processes by selecting model parameters that lead to the best empirical objective performance. This integrated approach, which we call integrated-estimation-optimization (IEO), can be readily shown to outperform simple estimate-then-optimize (ETO) when the model is misspecified. In this paper, we show that a reverse behavior appears when the model class is well-specified and there is sufficient data. Specifically, for a general class of nonlinear stochastic optimization problems, we show that simple ETO outperforms IEO asymptotically when the model class covers the ground truth, in the strong sense of stochastic dominance of the regret. Namely, the entire distribution of the regret, not only its mean or other moments, is always better for ETO compared to IEO. Our results also apply to constrained, contextual optimization problems where the decision depends on observed features. Whenever applicable, we also demonstrate how standard sample average approximation (SAA) performs the worst when the model class is well-specified in terms of regret, and best when it is misspecified. Finally, we provide experimental results to support our theoretical comparisons and illustrate when our insights hold in finite-sample regimes and under various degrees of misspecification.
MLFeb 24, 2023
Balanced Off-Policy Evaluation for Personalized PricingAdam N. Elmachtoub, Vishal Gupta, Yunfan Zhao
We consider a personalized pricing problem in which we have data consisting of feature information, historical pricing decisions, and binary realized demand. The goal is to perform off-policy evaluation for a new personalized pricing policy that maps features to prices. Methods based on inverse propensity weighting (including doubly robust methods) for off-policy evaluation may perform poorly when the logging policy has little exploration or is deterministic, which is common in pricing applications. Building on the balanced policy evaluation framework of Kallus (2018), we propose a new approach tailored to pricing applications. The key idea is to compute an estimate that minimizes the worst-case mean squared error or maximizes a worst-case lower bound on policy performance, where in both cases the worst-case is taken with respect to a set of possible revenue functions. We establish theoretical convergence guarantees and empirically demonstrate the advantage of our approach using a real-world pricing dataset.
GTApr 4
Fair Aggregation in Virtual Power PlantsLiudong Chen, Hyemi Kim, Adam N. Elmachtoub et al.
A virtual power plant (VPP) is operated by an aggregator that acts as a market intermediary, aggregating consumers to participate in wholesale power markets. By setting incentive prices, the aggregator induces consumers to sell energy and profits by providing this aggregated energy to the market. This supply is enabled by consumers' flexibility to adjust electricity consumption in response to market conditions. However, heterogeneity in flexibility means that profit-maximizing VPP pricing can create inequalities in participation and benefit allocation across consumers. In this paper, we develop a fairness-aware pricing framework to analyze how different fairness notions reshape system performance, measured by consumer Nash welfare, total consumer utility, and social welfare. We consider three fairness criteria: energy fairness, which ensures equitable energy provision; price fairness, which ensures similar incentive prices; and utility fairness, which ensures comparable levels of consumer utility. We model the aggregator-consumer interaction as a Stackelberg game and derive consumers' optimal responses to incentive prices. Using a stylized model, we show that profit-only pricing systematically disadvantages less flexible consumers. We further show that energy fairness can either improve or worsen all performance measures, and gains across most measures arise only at moderate fairness levels. Surprisingly, price fairness never benefits less flexible consumers, even when it reduces price disparities. By contrast, utility fairness protects less flexible consumers without benefiting more flexible ones. We validate our findings using data from an experiment in Norway under a tiered pricing scheme. Our results provide regulators and VPP operators with a systematic map linking fairness definitions and enforcement levels to operational and welfare outcomes.
LGMar 1, 2025
Dissecting the Impact of Model Misspecification in Data-driven OptimizationAdam N. Elmachtoub, Henry Lam, Haixiang Lan et al.
Data-driven optimization aims to translate a machine learning model into decision-making by optimizing decisions on estimated costs. Such a pipeline can be conducted by fitting a distributional model which is then plugged into the target optimization problem. While this fitting can utilize traditional methods such as maximum likelihood, a more recent approach uses estimation-optimization integration that minimizes decision error instead of estimation error. Although intuitive, the statistical benefit of the latter approach is not well understood yet is important to guide the prescriptive usage of machine learning. In this paper, we dissect the performance comparisons between these approaches in terms of the amount of model misspecification. In particular, we show how the integrated approach offers a ``universal double benefit'' on the top two dominating terms of regret when the underlying model is misspecified, while the traditional approach can be advantageous when the model is nearly well-specified. Our comparison is powered by finite-sample tail regret bounds that are derived via new higher-order expansions of regrets and the leveraging of a recent Berry-Esseen theorem.
MLOct 21, 2025
The Bias-Variance Tradeoff in Data-Driven Optimization: A Local Misspecification PerspectiveHaixiang Lan, Luofeng Liao, Adam N. Elmachtoub et al.
Data-driven stochastic optimization is ubiquitous in machine learning and operational decision-making problems. Sample average approximation (SAA) and model-based approaches such as estimate-then-optimize (ETO) or integrated estimation-optimization (IEO) are all popular, with model-based approaches being able to circumvent some of the issues with SAA in complex context-dependent problems. Yet the relative performance of these methods is poorly understood, with most results confined to the dichotomous cases of the model-based approach being either well-specified or misspecified. We develop the first results that allow for a more granular analysis of the relative performance of these methods under a local misspecification setting, which models the scenario where the model-based approach is nearly well-specified. By leveraging tools from contiguity theory in statistics, we show that there is a bias-variance tradeoff between SAA, IEO, and ETO under local misspecification, and that the relative importance of the bias and the variance depends on the degree of local misspecification. Moreover, we derive explicit expressions for the decision bias, which allows us to characterize (un)impactful misspecification directions, and provide further geometric understanding of the variance.
LGMay 20, 2025
An active learning framework for multi-group mean estimationAbdellah Aznag, Rachel Cummings, Adam N. Elmachtoub
We study a fundamental learning problem over multiple groups with unknown data distributions, where an analyst would like to learn the mean of each group. Moreover, we want to ensure that this data is collected in a relatively fair manner such that the noise of the estimate of each group is reasonable. In particular, we focus on settings where data are collected dynamically, which is important in adaptive experimentation for online platforms or adaptive clinical trials for healthcare. In our model, we employ an active learning framework to sequentially collect samples with bandit feedback, observing a sample in each period from the chosen group. After observing a sample, the analyst updates their estimate of the mean and variance of that group and chooses the next group accordingly. The analyst's objective is to dynamically collect samples to minimize the collective noise of the estimators, measured by the norm of the vector of variances of the mean estimators. We propose an algorithm, Variance-UCB, that sequentially selects groups according to an upper confidence bound on the variance estimate. We provide a general theoretical framework for providing efficient bounds on learning from any underlying distribution where the variances can be estimated reasonably. This framework yields upper bounds on regret that improve significantly upon all existing bounds, as well as a collection of new results for different objectives and distributions than those previously studied.
LGFeb 29, 2020
Decision Trees for Decision-Making under the Predict-then-Optimize FrameworkAdam N. Elmachtoub, Jason Cheuk Nam Liang, Ryan McNellis
We consider the use of decision trees for decision-making problems under the predict-then-optimize framework. That is, we would like to first use a decision tree to predict unknown input parameters of an optimization problem, and then make decisions by solving the optimization problem using the predicted parameters. A natural loss function in this framework is to measure the suboptimality of the decisions induced by the predicted input parameters, as opposed to measuring loss using input parameter prediction error. This natural loss function is known in the literature as the Smart Predict-then-Optimize (SPO) loss, and we propose a tractable methodology called SPO Trees (SPOTs) for training decision trees under this loss. SPOTs benefit from the interpretability of decision trees, providing an interpretable segmentation of contextual features into groups with distinct optimal solutions to the optimization problem of interest. We conduct several numerical experiments on synthetic and real data including the prediction of travel times for shortest path problems and predicting click probabilities for news article recommendation. We demonstrate on these datasets that SPOTs simultaneously provide higher quality decisions and significantly lower model complexity than other machine learning approaches (e.g., CART) trained to minimize prediction error.
LGMay 27, 2019
Generalization Bounds in the Predict-then-Optimize FrameworkOthman El Balghiti, Adam N. Elmachtoub, Paul Grigas et al.
The predict-then-optimize framework is fundamental in many practical settings: predict the unknown parameters of an optimization problem, and then solve the problem using the predicted values of the parameters. A natural loss function in this environment is to consider the cost of the decisions induced by the predicted parameters, in contrast to the prediction error of the parameters. This loss function was recently introduced in Elmachtoub and Grigas (2022) and referred to as the Smart Predict-then-Optimize (SPO) loss. In this work, we seek to provide bounds on how well the performance of a prediction model fit on training data generalizes out-of-sample, in the context of the SPO loss. Since the SPO loss is non-convex and non-Lipschitz, standard results for deriving generalization bounds do not apply. We first derive bounds based on the Natarajan dimension that, in the case of a polyhedral feasible region, scale at most logarithmically in the number of extreme points, but, in the case of a general convex feasible region, have linear dependence on the decision dimension. By exploiting the structure of the SPO loss function and a key property of the feasible region, which we denote as the strength property, we can dramatically improve the dependence on the decision and feature dimensions. Our approach and analysis rely on placing a margin around problematic predictions that do not yield unique optimal solutions, and then providing generalization bounds in the context of a modified margin SPO loss function that is Lipschitz continuous. Finally, we characterize the strength property and show that the modified SPO loss can be computed efficiently for both strongly convex bodies and polytopes with an explicit extreme point representation.
OCOct 22, 2017
Smart "Predict, then Optimize"Adam N. Elmachtoub, Paul Grigas
Many real-world analytics problems involve two significant challenges: prediction and optimization. Due to the typically complex nature of each challenge, the standard paradigm is predict-then-optimize. By and large, machine learning tools are intended to minimize prediction error and do not account for how the predictions will be used in the downstream optimization problem. In contrast, we propose a new and very general framework, called Smart "Predict, then Optimize" (SPO), which directly leverages the optimization problem structure, i.e., its objective and constraints, for designing better prediction models. A key component of our framework is the SPO loss function which measures the decision error induced by a prediction. Training a prediction model with respect to the SPO loss is computationally challenging, and thus we derive, using duality theory, a convex surrogate loss function which we call the SPO+ loss. Most importantly, we prove that the SPO+ loss is statistically consistent with respect to the SPO loss under mild conditions. Our SPO+ loss function can tractably handle any polyhedral, convex, or even mixed-integer optimization problem with a linear objective. Numerical experiments on shortest path and portfolio optimization problems show that the SPO framework can lead to significant improvement under the predict-then-optimize paradigm, in particular when the prediction model being trained is misspecified. We find that linear models trained using SPO+ loss tend to dominate random forest algorithms, even when the ground truth is highly nonlinear.
LGJun 14, 2017
A Practical Method for Solving Contextual Bandit Problems Using Decision TreesAdam N. Elmachtoub, Ryan McNellis, Sechan Oh et al.
Many efficient algorithms with strong theoretical guarantees have been proposed for the contextual multi-armed bandit problem. However, applying these algorithms in practice can be difficult because they require domain expertise to build appropriate features and to tune their parameters. We propose a new method for the contextual bandit problem that is simple, practical, and can be applied with little or no domain expertise. Our algorithm relies on decision trees to model the context-reward relationship. Decision trees are non-parametric, interpretable, and work well without hand-crafted features. To guide the exploration-exploitation trade-off, we use a bootstrapping approach which abstracts Thompson sampling to non-Bayesian settings. We also discuss several computational heuristics and demonstrate the performance of our method on several datasets.