LG · OC · ML · Feb 5, 2024

Decision-Focused Learning with Directional Gradients

arXiv:2402.03256v4 · 11 citations · h-index: 3 · NIPS
Originality: Highly original
AI Analysis

This paper addresses the challenge that discontinuous decision losses pose for machine learning practitioners who train predictive models for downstream optimization, offering a novel surrogate loss with asymptotic guarantees.

The paper tackles decision-focused learning in the predict-then-optimize framework by proposing Perturbation Gradient (PG) losses. These losses are Lipschitz continuous, and their approximation error vanishes as the number of samples grows, so optimizing them yields a best-in-class policy asymptotically. In the paper's numerical experiments, they outperform existing methods in misspecified settings.

We propose a novel family of decision-aware surrogate losses, called Perturbation Gradient (PG) losses, for the predict-then-optimize framework. The key idea is to connect the expected downstream decision loss with the directional derivative of a particular plug-in objective, and then approximate this derivative using zeroth-order gradient techniques. Unlike the original decision loss, which is typically piecewise constant and discontinuous, our new PG losses are Lipschitz continuous, difference-of-concave functions that can be optimized using off-the-shelf gradient-based methods. Most importantly, unlike existing surrogate losses, the approximation error of our PG losses vanishes as the number of samples grows. Hence, optimizing our surrogate loss yields a best-in-class policy asymptotically, even in misspecified settings. This is the first such result in misspecified settings, and we provide numerical evidence confirming that our PG losses substantively outperform existing proposals when the underlying model is misspecified.
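The key idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a linear objective minimized over a finite feasible set (the rows of a hypothetical matrix `Z`), takes the plug-in objective to be V(t) = min_z tᵀz, and approximates the directional derivative of V at the prediction `t` in the direction of the realized cost `y` by forward and backward finite differences with step `h`. All function names are illustrative.

```python
import numpy as np

def plug_in_value(t, Z):
    """V(t) = min over rows z of Z of t^T z (assumed finite feasible set)."""
    return float(np.min(Z @ t))

def decision_loss(t, y, Z):
    """True cost y^T z*(t) of the decision that is optimal for prediction t."""
    z_star = Z[np.argmin(Z @ t)]
    return float(y @ z_star)

def pg_loss_backward(t, y, Z, h):
    """Backward difference quotient of V at t in direction y (illustrative)."""
    return (plug_in_value(t, Z) - plug_in_value(t - h * y, Z)) / h

def pg_loss_forward(t, y, Z, h):
    """Forward difference quotient of V at t in direction y (illustrative)."""
    return (plug_in_value(t + h * y, Z) - plug_in_value(t, Z)) / h

# Toy instance: pick exactly one of three items (rows are unit vectors).
Z = np.eye(3)
t = np.array([1.0, 2.0, 3.0])   # predicted costs
y = np.array([2.5, 1.0, 3.0])   # realized costs

for h in (1.0, 0.1, 0.001):
    print(h,
          pg_loss_forward(t, y, Z, h),
          decision_loss(t, y, Z),
          pg_loss_backward(t, y, Z, h))
```

Because V is a minimum of linear functions and therefore concave, the forward quotient never exceeds the decision loss and the backward quotient never falls below it. For fixed `h`, both quotients are continuous in `t` and are differences of concave functions, whereas the decision loss jumps whenever the minimizing row of `Z @ t` changes.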

Code Implementations: 1 repo
