Operator Splitting, Policy Iteration, and Machine Learning for Stochastic Optimal Control

Alain Bensoussan, Thien P. B. Nguyen, Minh-Binh Tran, Son N. T. Tu

arXiv:2603.12167


AI Analysis

This work addresses computational challenges in stochastic optimal control and proves convergence rates that improve with the regularity of the initial data. It is an incremental advance, since it builds on existing splitting and policy iteration methods.

The paper solves the second-order Hamilton-Jacobi equation arising in stochastic optimal control with a splitting method that alternates a heat step and a purely first-order step. The splitting has an L∞ error of at most O(h^{1/3}) for C² data, where h is the splitting step, and the policy iteration used for the first-order step converges exponentially in a weighted L² norm.

We propose a splitting approach to solve the second-order Hamilton--Jacobi equation, reducing it to a heat step and a purely first-order step. The latter is implemented using a gradient value policy iteration algorithm, enabling efficient characteristic-based machine learning methods. We establish convergence rates for the splitting method. In particular, the $L^\infty$ error is bounded below by $\mathcal{O}(h)$ and above by $\mathcal{O}(h^{1/7})$ for Lipschitz initial data; this improves to $\mathcal{O}(h^{1/5})$ for semiconcave data and to $\mathcal{O}(h^{1/3})$ for $C^2$ data. We also prove an upper $L^1$ error estimate of order $\mathcal{O}(h^{1/2})$ in the periodic setting, where $h$ is the splitting step. For the first-order step, we provide a weighted $L^2$ error analysis that shows exponential convergence. Each iteration solves linear characteristic equations and learns the value function by minimizing a weighted value gradient loss. The approach yields stable and accurate numerical results.
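The splitting idea in the abstract can be illustrated on a toy problem. The sketch below is not the authors' algorithm: it assumes a one-dimensional periodic model equation $u_t + \tfrac12 |u_x|^2 = \nu u_{xx}$ (the abstract does not state the Hamiltonian), and it replaces the gradient value policy iteration of the first-order step with a standard Godunov upwind scheme. All function names and parameter values are hypothetical. For this particular model the Cole-Hopf transform $\varphi = e^{-u/(2\nu)}$ turns the equation into the heat equation, which gives a reference solution to compare against.

```python
import numpy as np

def heat_step(u, h, nu, k):
    """Exact periodic heat flow u_t = nu * u_xx over time h, computed with the FFT."""
    return np.real(np.fft.ifft(np.exp(-nu * k**2 * h) * np.fft.fft(u)))

def first_order_step(u, h, dx, cfl=0.5):
    """Advance u_t + 0.5 * u_x**2 = 0 over time h with a Godunov upwind scheme.

    Assumption: this stands in for the paper's policy-iteration solver.
    """
    t = 0.0
    while t < h - 1e-14:
        p_minus = (u - np.roll(u, 1)) / dx    # backward difference
        p_plus = (np.roll(u, -1) - u) / dx    # forward difference
        speed = max(np.abs(p_minus).max(), 1e-12)
        dt = min(cfl * dx / speed, h - t)     # CFL-limited substep
        # Godunov numerical Hamiltonian for the convex choice H(p) = p**2 / 2
        ham = 0.5 * np.maximum(np.maximum(p_minus, 0.0) ** 2,
                               np.minimum(p_plus, 0.0) ** 2)
        u = u - dt * ham
        t += dt
    return u

def splitting_solve(u0, T, n_steps, nu, dx, k):
    """Lie splitting: one heat step followed by one first-order step per interval h."""
    h = T / n_steps
    u = u0.copy()
    for _ in range(n_steps):
        u = heat_step(u, h, nu, k)
        u = first_order_step(u, h, dx)
    return u

def cole_hopf_reference(u0, T, nu, k):
    """Reference solution of the model equation via phi = exp(-u / (2 * nu))."""
    phi_T = heat_step(np.exp(-u0 / (2.0 * nu)), T, nu, k)
    return -2.0 * nu * np.log(phi_T)

# Hypothetical test problem: u0 = cos(x) on [0, 2*pi), nu = 0.1, T = 0.5.
N, nu, T = 256, 0.1, 0.5
x = np.linspace(0.0, 2.0 * np.pi, N, endpoint=False)
dx = x[1] - x[0]
k = 2.0 * np.pi * np.fft.fftfreq(N, d=dx)    # integer wavenumbers on this domain
u0 = np.cos(x)
reference = cole_hopf_reference(u0, T, nu, k)

for n_steps in (5, 10, 20):
    approx = splitting_solve(u0, T, n_steps, nu, dx, k)
    err = np.max(np.abs(approx - reference))
    print(f"h = {T / n_steps:.4f}   max error = {err:.3e}")
```

The measured error here mixes the splitting error with the spatial error of the upwind scheme, so it should not be read as a check of the rates stated in the abstract, which concern the splitting step $h$ alone.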
