LG OC ML | Feb 29, 2020

Decision Trees for Decision-Making under the Predict-then-Optimize Framework

Adam N. Elmachtoub, Jason Cheuk Nam Liang, Ryan McNellis

arXiv:2003.00360v2 | 21.7 | 141 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses decision-making problems in domains such as routing and recommendation with an interpretable method. The advance is incremental, since it builds on the existing SPO loss.

The paper tackles decision-making under the predict-then-optimize framework by proposing SPO Trees (SPOTs), a method for training decision trees with the Smart Predict-then-Optimize (SPO) loss, which measures decision suboptimality rather than prediction error. In experiments on synthetic and real data, SPOTs achieve higher-quality decisions and lower model complexity than methods such as CART that are trained to minimize prediction error.
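For reference, the SPO loss mentioned above is commonly written as follows for a linear objective. The notation is an assumption introduced here and does not appear on this page: $S$ is the feasible set, $c$ the true cost vector, $\hat{c}$ the predicted cost vector, and ties in the argmin are ignored for simplicity.

```latex
\[
  w^*(c) \in \arg\min_{w \in S} c^\top w,
  \qquad
  z^*(c) = \min_{w \in S} c^\top w,
\]
\[
  \ell_{\mathrm{SPO}}(\hat{c}, c) = c^\top w^*(\hat{c}) - z^*(c).
\]
```

The loss is zero whenever the decision induced by $\hat{c}$ is also optimal under $c$, however far $\hat{c}$ is from $c$.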

We consider the use of decision trees for decision-making problems under the predict-then-optimize framework. That is, we would like to first use a decision tree to predict unknown input parameters of an optimization problem, and then make decisions by solving the optimization problem using the predicted parameters. A natural loss function in this framework is to measure the suboptimality of the decisions induced by the predicted input parameters, as opposed to measuring loss using input parameter prediction error. This natural loss function is known in the literature as the Smart Predict-then-Optimize (SPO) loss, and we propose a tractable methodology called SPO Trees (SPOTs) for training decision trees under this loss. SPOTs benefit from the interpretability of decision trees, providing an interpretable segmentation of contextual features into groups with distinct optimal solutions to the optimization problem of interest. We conduct several numerical experiments on synthetic and real data including the prediction of travel times for shortest path problems and predicting click probabilities for news article recommendation. We demonstrate on these datasets that SPOTs simultaneously provide higher quality decisions and significantly lower model complexity than other machine learning approaches (e.g., CART) trained to minimize prediction error.
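The difference between decision suboptimality and prediction error can be illustrated with a small sketch. It assumes a linear objective and a feasible set small enough to enumerate; the function names, the two-path network, and all cost values are hypothetical and are not taken from the paper or its code.

```python
from typing import Sequence, Tuple

Decision = Tuple[int, ...]


def objective(cost: Sequence[float], decision: Decision) -> float:
    """Linear objective: total cost of the edges used by the decision."""
    return sum(c * x for c, x in zip(cost, decision))


def best_decision(cost: Sequence[float], decisions: Sequence[Decision]) -> Decision:
    """Feasible decision with the lowest objective (first one wins on ties)."""
    return min(decisions, key=lambda w: objective(cost, w))


def spo_loss(pred_cost, true_cost, decisions) -> float:
    """Excess true cost incurred by acting on the predicted costs."""
    w_pred = best_decision(pred_cost, decisions)
    w_true = best_decision(true_cost, decisions)
    return objective(true_cost, w_pred) - objective(true_cost, w_true)


def squared_error(pred_cost, true_cost) -> float:
    """Conventional prediction error, for comparison."""
    return sum((p - t) ** 2 for p, t in zip(pred_cost, true_cost))


# Hypothetical network with four edges and two paths:
# path A uses edges 0 and 1, path B uses edges 2 and 3.
paths = [(1, 1, 0, 0), (0, 0, 1, 1)]
true_cost = (1, 2, 2, 2)      # path A costs 3, path B costs 4

close_pred = (2, 3, 1, 1)     # small prediction error, but ranks B below A
far_pred = (10, 20, 40, 40)   # large prediction error, but ranks A below B

loss_close = spo_loss(close_pred, true_cost, paths)   # 1
loss_far = spo_loss(far_pred, true_cost, paths)       # 0
mse_close = squared_error(close_pred, true_cost)      # 4
mse_far = squared_error(far_pred, true_cost)          # 3293
```

In this toy case the prediction with the smaller squared error leads to the worse decision, which is the situation a loss based on decision suboptimality is meant to penalize.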
