EM LG ST ME · Jul 15, 2025

Inference on Optimal Policy Values and Other Irregular Functionals via Smoothing

Justin Whitehouse, Morgane Austern, Vasilis Syrgkanis

arXiv:2507.11780v14.34 citations · h-index: 39

Originality: Incremental advance

AI Analysis

The paper addresses a key challenge in causal inference for developing individualized treatment regimes. Its method balances computational feasibility with statistical robustness, though it is an incremental improvement on existing smoothing approaches.

The paper tackles the problem of constructing confidence intervals for the optimal policy value in causal inference. Because the functional that defines this value is non-differentiable, the authors develop a softmax smoothing-based estimator that achieves $\sqrt{n}$ convergence rates, avoids parametric assumptions, and is often statistically efficient.

Constructing confidence intervals for the value of an optimal treatment policy is an important problem in causal inference. Insight into the optimal policy value can guide the development of reward-maximizing, individualized treatment regimes. However, because the functional that defines the optimal value is non-differentiable, standard semi-parametric approaches for performing inference fail to be directly applicable. Existing approaches for handling this non-differentiability fall roughly into two camps. In one camp are estimators based on constructing smooth approximations of the optimal value. These approaches are computationally lightweight, but typically place unrealistic parametric assumptions on outcome regressions. In another camp are approaches that directly de-bias the non-smooth objective. These approaches don't place parametric assumptions on nuisance functions, but they either require the computation of intractably-many nuisance estimates, assume unrealistic $L^\infty$ nuisance convergence rates, or make strong margin assumptions that prohibit non-response to a treatment. In this paper, we revisit the problem of constructing smooth approximations of non-differentiable functionals. By carefully controlling first-order bias and second-order remainders, we show that a softmax smoothing-based estimator can be used to estimate parameters that are specified as a maximum of scores involving nuisance components. In particular, this includes the value of the optimal treatment policy as a special case. Our estimator obtains $\sqrt{n}$ convergence rates, avoids parametric restrictions/unrealistic margin assumptions, and is often statistically efficient.
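
To make the smoothing idea concrete, the following is a minimal sketch of a softmax-weighted approximation to a maximum of scores. It is not the paper's estimator: the de-biasing step and nuisance estimation described in the abstract are omitted. The names `softmax_weights`, `smoothed_max`, and the inverse-temperature parameter `beta`, as well as the example scores, are assumptions introduced here for illustration.

```python
import math

def softmax_weights(scores, beta):
    """Softmax weights with inverse temperature beta (hypothetical helper).

    Subtracting the maximum before exponentiating keeps the computation
    numerically stable for large beta.
    """
    m = max(scores)
    exps = [math.exp(beta * (s - m)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def smoothed_max(scores, beta):
    """Softmax-weighted average of the scores.

    Equals the plain mean at beta = 0 and approaches max(scores) as beta
    grows, while remaining differentiable in the scores for finite beta.
    """
    weights = softmax_weights(scores, beta)
    return sum(w * s for w, s in zip(weights, scores))

# Made-up conditional mean outcomes for three treatments at one covariate value.
scores = [0.2, 0.5, 0.45]

for beta in (0.0, 1.0, 10.0, 100.0):
    print(f"beta={beta:6.1f}  smoothed max={smoothed_max(scores, beta):.5f}")
```

In this sketch, a larger `beta` shrinks the gap between the smoothed value and the true maximum but makes the map less smooth. That is the approximation-bias versus smoothness trade-off that the abstract describes controlling through first-order bias and second-order remainders.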
