Debiasing Conditional Stochastic Optimization
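This work addresses a fundamental bottleneck in conditional stochastic optimization (CSO) problems: the bias of sample-averaged gradients, which drives up sample complexity. Reducing that bias offers improved efficiency for researchers and practitioners in fields such as machine learning and finance.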
The paper tackles the biased gradient problem in conditional stochastic optimization (CSO), which affects applications such as portfolio selection and reinforcement learning. It introduces a stochastic extrapolation technique that reduces the bias and, combined with variance reduction, achieves significantly better sample complexity than existing bounds for nonconvex smooth objectives.
In this paper, we study the conditional stochastic optimization (CSO) problem, which covers a variety of applications including portfolio selection, reinforcement learning, robust learning, and causal inference. The sample-averaged gradient of the CSO objective is biased due to its nested structure, and therefore requires a high sample complexity for convergence. We introduce a general stochastic extrapolation technique that effectively reduces the bias. We show that for nonconvex smooth objectives, combining this extrapolation with variance reduction techniques can achieve a significantly better sample complexity than the existing bounds. Additionally, we develop new algorithms for the finite-sum variant of the CSO problem that also significantly improve upon existing results. Finally, we believe that our debiasing technique has the potential to be a useful tool for addressing similar challenges in other stochastic optimization problems.
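
To make the source of the bias concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a hypothetical outer function f(y) = exp(y), whose derivative is also exp(y), and Gaussian inner samples with mean mu. Evaluating the outer derivative at an inner sample mean gives a plug-in estimate whose bias shrinks like 1/m in the number of inner samples m. A generic Richardson-style extrapolation, which combines estimates built from 2m and m inner samples, cancels that leading term. All function names and parameter values are illustrative assumptions.

```python
import math
import random


def outer_grad(y):
    # Derivative of the assumed outer function f(y) = exp(y).
    return math.exp(y)


def plug_in(samples):
    # Plug-in estimate: outer derivative evaluated at the inner sample mean.
    # Nonlinearity of outer_grad makes this biased for finite sample sizes.
    return outer_grad(sum(samples) / len(samples))


def extrapolated(samples):
    # Richardson-style extrapolation (illustrative, not the paper's scheme):
    # 2 * estimate(2m samples) - estimate(m samples) cancels the O(1/m) bias.
    half = len(samples) // 2
    return 2.0 * plug_in(samples) - plug_in(samples[:half])


def estimate_biases(m=4, trials=100_000, mu=0.0, sigma=1.0, seed=0):
    # Monte Carlo estimate of the bias of both estimators, each using the
    # same budget of 2 * m inner samples per trial.
    rng = random.Random(seed)
    target = outer_grad(mu)  # outer derivative at the true inner mean
    plug_sum = 0.0
    extra_sum = 0.0
    for _ in range(trials):
        samples = [rng.gauss(mu, sigma) for _ in range(2 * m)]
        plug_sum += plug_in(samples)
        extra_sum += extrapolated(samples)
    return plug_sum / trials - target, extra_sum / trials - target


plug_bias, extra_bias = estimate_biases()
print(f"plug-in bias:      {plug_bias:+.4f}")
print(f"extrapolated bias: {extra_bias:+.4f}")
```

In this toy setting the extrapolated estimate has a much smaller bias than the plug-in estimate at the same inner-sample budget, at the cost of somewhat higher variance. That trade-off is one reason the abstract pairs extrapolation with variance reduction techniques.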