Contextual Distributionally Robust Optimization with Causal and Continuous Structure: An Interpretable and Tractable Approach

arXiv:2601.11016v11.7h-index: 7

Originality: Incremental advance

AI Analysis

This paper addresses robust decision-making under uncertainty for applications that require causal interpretability. It appears incremental, since it builds on existing DRO and causal methods.

The paper tackles contextual distributionally robust optimization with a framework that incorporates causal and continuous structure. It yields interpretable decision rules and a stochastic compositional gradient algorithm that reaches an ε-stationary point at a rate of O(ε⁻⁴), and the authors report superior performance on synthetic and real-world datasets.

In this paper, we introduce a framework for contextual distributionally robust optimization (DRO) that considers the causal and continuous structure of the underlying distribution by developing interpretable and tractable decision rules that prescribe decisions using covariates. We first introduce the causal Sinkhorn discrepancy (CSD), an entropy-regularized causal Wasserstein distance that encourages continuous transport plans while preserving the causal consistency. We then formulate a contextual DRO model with a CSD-based ambiguity set, termed Causal Sinkhorn DRO (Causal-SDRO), and derive its strong dual reformulation where the worst-case distribution is characterized as a mixture of Gibbs distributions. To solve the corresponding infinite-dimensional policy optimization, we propose the Soft Regression Forest (SRF) decision rule, which approximates optimal policies within arbitrary measurable function spaces. The SRF preserves the interpretability of classical decision trees while being fully parametric, differentiable, and Lipschitz smooth, enabling intrinsic interpretation from both global and local perspectives. To solve the Causal-SDRO with parametric decision rules, we develop an efficient stochastic compositional gradient algorithm that converges to an $\varepsilon$-stationary point at a rate of $O(\varepsilon^{-4})$, matching the convergence rate of standard stochastic gradient descent. Finally, we validate our method through numerical experiments on synthetic and real-world datasets, demonstrating its superior performance and interpretability.
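To make the idea of a tree-shaped decision rule that is "fully parametric, differentiable" more concrete, here is a minimal sketch of a generic soft binary tree. It is not the paper's SRF, whose exact parameterization the abstract does not give. It assumes sigmoid gates at the internal nodes and constant leaf values, and the names `soft_tree_predict`, `weights`, `biases`, and `leaf_values` are hypothetical. Each input is routed to every leaf with some probability, so the prediction is a smooth function of the parameters, and the leaf probabilities show which paths contributed most.

```python
import math


def sigmoid(z):
    """Logistic function; maps a real score to a routing probability."""
    return 1.0 / (1.0 + math.exp(-z))


def soft_tree_predict(x, weights, biases, leaf_values, depth):
    """Evaluate a generic soft binary tree (illustrative, not the paper's SRF).

    Internal nodes are stored in heap order: node k has children 2k+1 (left)
    and 2k+2 (right). Each internal node routes right with probability
    sigmoid(w . x + b) and left with the complementary probability.
    Returns the prediction and the probability of reaching each leaf.
    """
    n_leaves = 2 ** depth
    prediction = 0.0
    leaf_probs = []
    for leaf in range(n_leaves):
        node = 0
        prob = 1.0
        for level in range(depth):
            # The bits of the leaf index (most significant first) encode the
            # left/right turns taken on the path from the root to that leaf.
            go_right = (leaf >> (depth - 1 - level)) & 1
            score = sum(w * xi for w, xi in zip(weights[node], x)) + biases[node]
            gate = sigmoid(score)
            prob *= gate if go_right else (1.0 - gate)
            node = 2 * node + 1 + go_right
        leaf_probs.append(prob)
        prediction += prob * leaf_values[leaf]
    return prediction, leaf_probs


# Usage: a depth-2 tree on a one-dimensional covariate (3 internal nodes, 4 leaves).
weights = [[0.0], [0.0], [0.0]]
biases = [0.0, 0.0, 0.0]
leaf_values = [1.0, 2.0, 3.0, 4.0]
pred, probs = soft_tree_predict([1.0], weights, biases, leaf_values, depth=2)
```

With all gates at zero, every leaf is reached with equal probability and the prediction is the average of the leaf values. As the gate weights grow, the routing approaches that of a hard decision tree.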
