Differentiable Distributionally Robust Optimization Layers
This work addresses a gap in decision-focused learning for researchers and practitioners who optimize under uncertainty: it offers a method for building differentiable DRO layers, though the contribution is incremental in that it extends existing decision-focused paradigms to DRO.
The paper tackles the problem of embedding distributionally robust optimization (DRO) as a differentiable layer in learning pipelines, for which no method was previously known. It develops a novel dual-view methodology for mixed-integer DRO problems with parameterized ambiguity sets, proves asymptotic convergence of the resulting surrogate under regularization, and demonstrates the approach on contextual decision-making tasks.
In recent years, there has been growing research interest in decision-focused learning, which embeds optimization problems as layers in learning pipelines and has demonstrated superior performance over the prediction-focused approach. However, for distributionally robust optimization (DRO), a popular paradigm for decision-making under uncertainty, it is still unknown how to embed it as a layer, i.e., how to differentiate decisions with respect to an ambiguity set. In this paper, we develop such differentiable DRO layers for generic mixed-integer DRO problems with parameterized second-order conic ambiguity sets and discuss their extension to Wasserstein ambiguity sets. To differentiate the mixed-integer decisions, we propose a novel dual-view methodology that handles the continuous and discrete parts of decisions via different principles. Specifically, we construct a differentiable energy-based surrogate to implement the dual-view methodology and use importance sampling to estimate its gradient. We further prove that this surrogate enjoys asymptotic convergence under regularization. As an application of the proposed differentiable DRO layers, we develop a novel decision-focused learning pipeline for contextual distributionally robust decision-making tasks and compare it with the prediction-focused approach in experiments.
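To make the idea of an energy-based surrogate with an importance-sampled gradient more concrete, the following is a minimal toy sketch, not the paper's construction. It assumes a purely discrete binary decision z, a hypothetical linear energy E(z; theta) = theta . z, a hypothetical temperature tau standing in for the regularization, a made-up task loss, and a uniform proposal distribution; none of these choices come from the abstract, and no DRO problem or ambiguity set is modeled. The surrogate is the expected task loss under p(z) proportional to exp(-E(z; theta) / tau), whose gradient with respect to theta equals -Cov_p(loss, z) / tau and is estimated here with self-normalized importance sampling.

```python
import itertools
import numpy as np


def energy(z, theta):
    # Hypothetical linear energy; lower energy means a more preferred decision.
    return z @ theta


def task_loss(z):
    # Hypothetical downstream loss evaluated on binary decisions z.
    return (z.sum(axis=-1) - 2.0) ** 2


def snis_gradient(theta, tau, num_samples, rng):
    """Self-normalized importance-sampling estimate of
    d/d(theta) E_{p_theta}[task_loss(z)], with p_theta(z) ∝ exp(-energy / tau)."""
    n = theta.shape[0]
    # Uniform proposal over {0, 1}^n (an assumption made for illustration).
    z = rng.integers(0, 2, size=(num_samples, n)).astype(float)
    # The proposal density is constant, so it cancels after normalization.
    log_w = -energy(z, theta) / tau
    log_w -= log_w.max()  # stabilize the exponentials
    w = np.exp(log_w)
    w /= w.sum()
    f = task_loss(z)
    mean_f = w @ f
    mean_z = w @ z
    mean_fz = (w * f) @ z
    # Score-function identity: gradient = -Cov_p(f, z) / tau.
    return -(mean_fz - mean_f * mean_z) / tau


def exact_gradient(theta, tau):
    # Reference value by enumerating all 2^n decisions (feasible only for tiny n).
    n = theta.shape[0]
    z = np.array(list(itertools.product([0.0, 1.0], repeat=n)))
    log_p = -energy(z, theta) / tau
    p = np.exp(log_p - log_p.max())
    p /= p.sum()
    f = task_loss(z)
    return -((p * f) @ z - (p @ f) * (p @ z)) / tau


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.array([0.5, -0.3, 0.8, -0.1])
    print("sampled:", snis_gradient(theta, 1.0, 200_000, rng))
    print("exact:  ", exact_gradient(theta, 1.0))
```

In this toy setting the sampled estimate can be checked against exact enumeration; in a realistic mixed-integer problem enumeration is unavailable, which is presumably why a sampling-based estimator is needed. A uniform proposal is the simplest choice and would likely be too high-variance at low temperature or in high dimension.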