OC LG Apr 7

An Actor-Critic Framework for Continuous-Time Jump-Diffusion Controls with Normalizing Flows

arXiv:2604.05398 78.4

AI Analysis

This paper addresses the computational difficulty of optimal control under complex stochastic dynamics in finance and economics, though it appears incremental: it extends existing actor-critic methods to this specific setting.

The authors tackle the problem of computing optimal policies for continuous-time stochastic control with time-inhomogeneous jump-diffusion dynamics, which is difficult because of explicit time dependence, discontinuous shocks, and high dimensionality. They propose an actor-critic framework whose actor is parameterized by conditional normalizing flows, and validate it on time-inhomogeneous linear-quadratic control, Merton portfolio optimization, and a multi-agent portfolio game. The reported results show stable learning under jump discontinuities, accurate approximation of optimal policies, and favorable scaling with dimension and number of agents.

Continuous-time stochastic control with time-inhomogeneous jump-diffusion dynamics is central in finance and economics, but computing optimal policies is difficult under explicit time dependence, discontinuous shocks, and high dimensionality. We propose an actor-critic framework that serves as a mesh-free solver for entropy-regularized control problems and stochastic games with jumps. The approach is built on a time-inhomogeneous little q-function and an appropriate occupation measure, yielding a policy-gradient representation that accommodates time-dependent drift, volatility, and jump terms. To represent expressive stochastic policies in continuous-action spaces, we parameterize the actor using conditional normalizing flows, enabling flexible non-Gaussian policies while retaining exact likelihood evaluation for entropy regularization and policy optimization. We validate the method on time-inhomogeneous linear-quadratic control, Merton portfolio optimization, and a multi-agent portfolio game, using explicit solutions or high-accuracy benchmarks. Numerical results demonstrate stable learning under jump discontinuities, accurate approximation of optimal stochastic policies, and favorable scaling with respect to dimension and number of agents.
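As a rough illustration of why a flow-based actor keeps likelihoods exact, the sketch below builds a toy one-dimensional conditional flow in NumPy. It is not the paper's architecture: the linear conditioner, the parameter names (`w_mu`, `w_log_sigma`), and the choice of a `sinh` layer are all assumptions made for this example. A standard normal sample is shifted and scaled by quantities that depend on time `t` and state `x`, then pushed through `sinh`, which gives a heavy-tailed, non-Gaussian policy whose log-density follows from the change-of-variables formula.

```python
import numpy as np

def conditioner(params, t, x):
    """Hypothetical linear conditioner: maps (t, x) to a shift and a log-scale."""
    feats = np.stack([t, x, np.ones_like(t)], axis=-1)
    mu = feats @ params["w_mu"]
    log_sigma = feats @ params["w_log_sigma"]
    return mu, log_sigma

def sample_action(params, t, x, rng):
    """Draw a = sinh(mu + sigma * z) with z ~ N(0, 1)."""
    mu, log_sigma = conditioner(params, t, x)
    z = rng.standard_normal(size=mu.shape)
    u = mu + np.exp(log_sigma) * z
    return np.sinh(u)

def log_prob(params, t, x, a):
    """Exact log-density of action a given (t, x) via change of variables."""
    mu, log_sigma = conditioner(params, t, x)
    u = np.arcsinh(a)                       # invert the sinh layer
    z = (u - mu) * np.exp(-log_sigma)       # invert the affine layer
    log_base = -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)
    # log|dz/da| = -log_sigma - log cosh(u), and cosh(arcsinh(a)) = sqrt(1 + a^2)
    log_det = -log_sigma - 0.5 * np.log1p(a**2)
    return log_base + log_det

def entropy_estimate(params, t, x, rng, n_samples=10_000):
    """Monte Carlo estimate of the policy entropy at a single (t, x)."""
    t_rep = np.full(n_samples, t, dtype=float)
    x_rep = np.full(n_samples, x, dtype=float)
    a = sample_action(params, t_rep, x_rep, rng)
    return -np.mean(log_prob(params, t_rep, x_rep, a))

# Illustrative parameter values only.
params = {
    "w_mu": np.array([0.2, -0.1, 0.0]),
    "w_log_sigma": np.array([0.0, 0.0, np.log(0.5)]),
}
rng = np.random.default_rng(0)
h = entropy_estimate(params, 0.5, 1.0, rng)
```

Because both layers are invertible with closed-form Jacobians, `log_prob` needs no density approximation, which is the property an entropy-regularized objective relies on. A practical actor would presumably stack several such layers and use a neural-network conditioner.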
