OC · NA · AP · NA · Mar 12

Operator Splitting, Policy Iteration, and Machine Learning for Stochastic Optimal Control

arXiv:2603.12167 · 13.11 citations · h-index: 10
Predicted impact: top 92% in OC · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses computational challenges in stochastic optimal control. It proves convergence rates for an operator-splitting scheme that improve with the regularity of the initial data (Lipschitz, semiconcave, C²), but the advance is incremental because it builds on existing splitting and policy iteration methods.

The paper solves the second-order Hamilton-Jacobi equation arising in stochastic optimal control with a splitting method that reduces it to a heat step and a first-order step. It proves an upper bound of O(h^{1/3}) on the L∞ splitting error for C² data, where h is the splitting step, and shows exponential convergence, in a weighted L² norm, for the first-order step.

We propose a splitting approach to solve the second-order Hamilton--Jacobi equation, reducing it to a heat step and a purely first-order step. The latter is implemented using a gradient value policy iteration algorithm, enabling efficient characteristic-based machine learning methods. We establish convergence rates for the splitting method. In particular, the $L^\infty$ error is bounded below by $\mathcal{O}(h)$ and above by $\mathcal{O}(h^{1/7})$ for Lipschitz initial data; this improves to $\mathcal{O}(h^{1/5})$ for semiconcave data and to $\mathcal{O}(h^{1/3})$ for $C^2$ data. We also prove an upper $L^1$ error estimate of order $\mathcal{O}(h^{1/2})$ in the periodic setting, where $h$ is the splitting step. For the first-order step, we provide a weighted $L^2$ error analysis that shows exponential convergence. Each iteration solves linear characteristic equations and learns the value function by minimizing a weighted value gradient loss. The approach yields stable and accurate numerical results.
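To make the heat-step / first-order-step idea concrete, here is a minimal Python sketch on a toy problem that is not taken from the paper. It assumes the model equation $u_t + \tfrac12 u_x^2 = \nu u_{xx}$ on the real line with quadratic initial data $u(x,0) = a_0 x^2/2$, chosen only because both sub-steps then have closed forms: the solution stays of the form $a(t)x^2/2 + c(t)$, with exact values $a(t) = a_0/(1+a_0 t)$ and $c(t) = \nu\log(1+a_0 t)$. The function names and parameter values are hypothetical, and the first-order step is solved exactly here rather than by the policy iteration and learning procedure the abstract describes.

```python
import math


def lie_splitting(a0, nu, T, n_steps):
    """Lie splitting for u_t + 0.5*u_x**2 = nu*u_xx, u(x,0) = a0*x**2/2.

    Toy setting (an assumption, not the paper's scheme): the solution is
    u(x,t) = a(t)*x**2/2 + c(t), so each sub-step acts on the pair (a, c).
    """
    h = T / n_steps  # splitting step
    a, c = a0, 0.0
    for _ in range(n_steps):
        # Heat step u_t = nu*u_xx: curvature a is unchanged, c grows by nu*a*h.
        c += nu * a * h
        # First-order step u_t + 0.5*u_x**2 = 0: a' = -a**2, solved exactly.
        a = a / (1.0 + a * h)
    return a, c


def exact(a0, nu, T):
    """Closed-form (a(T), c(T)) for the unsplit equation."""
    return a0 / (1.0 + a0 * T), nu * math.log(1.0 + a0 * T)


def splitting_error(a0, nu, T, n_steps):
    """Error in the constant term c; the curvature a agrees up to rounding."""
    _, c = lie_splitting(a0, nu, T, n_steps)
    _, c_ex = exact(a0, nu, T)
    return abs(c - c_ex)


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, splitting_error(a0=1.0, nu=0.5, T=1.0, n_steps=n))
```

In this toy case the splitting error comes entirely from the constant term and roughly halves each time $h$ is halved, i.e. it behaves like $\mathcal{O}(h)$. Quadratic data is smoother than the Lipschitz, semiconcave, and $C^2$ classes covered by the stated rates, so the sketch illustrates the mechanics of the splitting only, not those estimates.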

