Christoph Reisinger

OC
h-index13
36papers
222citations
Novelty58%
AI Score50

36 Papers

NAJan 20, 2016
Piecewise Constant Policy Approximations to Hamilton-Jacobi-Bellman Equations

Christoph Reisinger, Peter Forsyth

An advantageous feature of piecewise constant policy timestepping for Hamilton-Jacobi-Bellman (HJB) equations is that different linear approximation schemes, and indeed different meshes, can be used for the resulting linear equations for different control parameters. Standard convergence analysis suggests that monotone (i.e., linear) interpolation must be used to transfer data between meshes. Using the equivalence to a switching system and an adaptation of the usual arguments based on consistency, stability and monotonicity, we show that if limited, potentially higher order interpolation is used for the mesh transfer, convergence is guaranteed. We provide numerical tests for the mean-variance optimal investment problem and the uncertain volatility option pricing model, and compare the results to published test cases.

NAFeb 13, 2018
Multilevel simulation of functionals of Bernoulli random variables with application to basket credit derivatives

Karolina Bujok, Ben Hambly, Christoph Reisinger

We consider $N$ Bernoulli random variables, which are independent conditional on a common random factor determining their probability distribution. We show that certain expected functionals of the proportion $L_N$ of variables in a given state converge at rate $1/N$ as $N\rightarrow \infty$. Based on these results, we propose a multi-level simulation algorithm using a family of sequences with increasing length, to obtain estimators for these expected functionals with a mean-square error of $ε^2$ and computational complexity of order $ε^{-2}$, independent of $N$. In particular, this optimal complexity order also holds for the infinite-dimensional limit. Numerical examples are presented for tranche spreads of basket credit derivatives.

NAApr 6, 2012
Stochastic finite differences and multilevel Monte Carlo for a class of SPDEs in finance

Michael B. Giles, Christoph Reisinger

In this article, we propose a Milstein finite difference scheme for a stochastic partial differential equation (SPDE) describing a large particle system. We show, by means of Fourier analysis, that the discretisation on an unbounded domain is convergent of first order in the timestep and second order in the spatial grid size, and that the discretisation is stable with respect to boundary data. Numerical experiments clearly indicate that the same convergence order also holds for boundary-value problems. Multilevel path simulation, previously used for SDEs, is shown to give substantial complexity gains compared to a standard discretisation of the SPDE or direct simulation of the particle system. We derive complexity bounds and illustrate the results by an application to basket credit derivatives.

CPSep 17, 2011
On the Use of Policy Iteration as an Easy Way of Pricing American Options

Christoph Reisinger, Jan Hendrik Witte

In this paper, we demonstrate that policy iteration, introduced in the context of HJB equations in [Forsyth & Labahn, 2007], is an extremely simple generic algorithm for solving linear complementarity problems resulting from the finite difference and finite element approximation of American options. We show that, in general, O(N) is an upper and lower bound on the number of iterations needed to solve a discrete LCP of size N. If embedded in a class of standard discretisations with M time steps, the overall complexity of American option pricing is indeed only O(N(M+N)), and, therefore, for M N, identical to the pricing of European options, which is O(MN). We also discuss the numerical properties and robustness with respect to model parameters in relation to penalty and projected relaxation methods.

CPNov 30, 2010
A Penalty Method for the Numerical Solution of Hamilton-Jacobi-Bellman (HJB) Equations in Finance

Jan Hendrik Witte, Christoph Reisinger

We present a simple and easy to implement method for the numerical solution of a rather general class of Hamilton-Jacobi-Bellman (HJB) equations. In many cases, the considered problems have only a viscosity solution, to which, fortunately, many intuitive (e.g. finite difference based) discretisations can be shown to converge. However, especially when using fully implicit time stepping schemes with their desirable stability properties, one is still faced with the considerable task of solving the resulting nonlinear discrete system. In this paper, we introduce a penalty method which approximates the nonlinear discrete system to first order in the penalty parameter, and we show that an iterative scheme can be used to solve the penalised discrete problem in finitely many steps. We include a number of examples from mathematical finance for which the described approach yields a rigorous numerical scheme and present numerical results.

CPDec 1, 2011
Penalty Methods for the Solution of Discrete HJB Equations -- Continuous Control and Obstacle Problems

Jan Hendrik Witte, Christoph Reisinger

In this paper, we present a novel penalty approach for the numerical solution of continuously controlled HJB equations and HJB obstacle problems. Our results include estimates of the penalisation error for a class of penalty terms, and we show that variations of Newton's method can be used to obtain globally convergent iterative solvers for the penalised equations. Furthermore, we discuss under what conditions local quadratic convergence of the iterative solvers can be expected. We include numerical results demonstrating the competitiveness of our methods.

NAMar 10, 2018
Approximation schemes for mixed optimal stopping and control problems with nonlinear expectations and jumps

Roxana Dumitrescu, Christoph Reisinger, Yufei Zhang

We propose a class of numerical schemes for mixed optimal stopping and control of processes with infinite activity jumps and where the objective is evaluated by a nonlinear expectation. Exploiting an approximation by switching systems, piecewise constant policy timestepping reduces the problem to nonlocal semi-linear equations with different control parameters, uncoupled over individual time steps, which we solve by fully implicit monotone approximations to the controlled diffusion and the nonlocal term, and specifically the Lax-Friedrichs scheme for the nonlinearity in the gradient. We establish a comparison principle for the switching system and demonstrate the convergence of the schemes, which subsequently gives a constructive proof for the existence of a solution to the switching system. Numerical experiments are presented for a recursive utility maximization problem to demonstrate the effectiveness of the new schemes.

NAAug 25, 2018
Semi-analytical solution of a McKean-Vlasov equation with feedback through hitting a boundary

Alexander Lipton, Vadim Kaushansky, Christoph Reisinger

In this paper, we study the non-linear diffusion equation associated with a particle system where the common drift depends on the rate of absorption of particles at a boundary. We provide an interpretation as a structural credit risk model with default contagion in a large interconnected banking system. Using the method of heat potentials, we derive a coupled system of Volterra integral equations for the transition density and for the loss through absorption. An approximation by expansion is given for a small interaction parameter. We also present a numerical solution algorithm and conduct computational tests.

NANov 15, 2016
High-order filtered schemes for time-dependent second order HJB equations

Olivier Bokanowski, Athena Picarelli, Christoph Reisinger

In this paper, we present and analyse a class of "filtered" numerical schemes for second order Hamilton-Jacobi-Bellman equations. Our approach follows the ideas introduced in B.D. Froese and A.M. Oberman, Convergent filtered schemes for the Monge-Ampère partial differential equation, SIAM J. Numer. Anal., 51(1):423--444, 2013, and more recently applied by other authors to stationary or time-dependent first order Hamilton-Jacobi equations. For high order approximation schemes (where "high" stands for greater than one), the inevitable loss of monotonicity prevents the use of the classical theoretical results for convergence to viscosity solutions. The work introduces a suitable local modification of these schemes by "filtering" them with a monotone scheme, such that they can be proven convergent and still show an overall high order behaviour for smooth enough solutions. We give theoretical proofs of these claims and illustrate the behaviour with numerical tests from mathematical finance, focussing also on the use of backward difference formulae (BDF) for constructing the high order schemes.

NANov 7, 2016
Boundary Treatment and Multigrid Preconditioning for Semi-Lagrangian Schemes Applied to Hamilton-Jacobi-Bellman Equations

Christoph Reisinger, Julen Rotaetxe Arto

We analyse two practical aspects that arise in the numerical solution of Hamilton-Jacobi-Bellman (HJB) equations by a particular class of monotone approximation schemes known as semi-Lagrangian schemes. These schemes make use of a wide stencil to achieve convergence and result in discretization matrices that are less sparse and less local than those coming from standard finite difference schemes. This leads to computational difficulties not encountered there. In particular, we consider the overstepping of the domain boundary and analyse the accuracy and stability of stencil truncation. This truncation imposes a stricter CFL condition for explicit schemes in the vicinity of boundaries than in the interior, such that implicit schemes become attractive. We then study the use of geometric, algebraic and aggregation-based multigrid preconditioners to solve the resulting discretised systems from implicit time stepping schemes efficiently. Finally, we illustrate the performance of these techniques numerically for benchmark test cases from the literature.

NAMay 29, 2018
Simulation of particle systems interacting through hitting times

Vadim Kaushansky, Christoph Reisinger

We develop an Euler-type particle method for the simulation of a McKean--Vlasov equation arising from a mean-field model with positive feedback from hitting a boundary. Under assumptions on the parameters which ensure differentiable solutions, we establish convergence of order $1/2$ in the time step. Moreover, we give a modification of the scheme using Brownian bridges and local mesh refinement, which improves the order to $1$. We confirm our theoretical results with numerical tests and empirically investigate cases with blow-up.

OCMar 22, 2022
Linear convergence of a policy gradient method for some finite horizon continuous time control problems

Christoph Reisinger, Wolfgang Stockinger, Yufei Zhang

Despite its popularity in the reinforcement learning community, a provably convergent policy gradient method for continuous space-time control problems with nonlinear state dynamics has been elusive. This paper proposes proximal gradient algorithms for feedback controls of finite-time horizon stochastic control problems. The state dynamics are nonlinear diffusions with control-affine drift, and the cost functions are nonconvex in the state and nonsmooth in the control. The system noise can degenerate, which allows for deterministic control problems as special cases. We prove under suitable conditions that the algorithm converges linearly to a stationary point of the control problem, and is stable with respect to policy updates by approximate gradient steps. The convergence result justifies the recent reinforcement learning heuristics that adding entropy regularization or a fictitious discount factor to the optimization objective accelerates the convergence of policy gradient methods. The proof exploits careful regularity estimates of backward stochastic differential equations.

NAAug 2, 2012
Mean-square stability and error analysis of implicit time-stepping schemes for linear parabolic SPDEs with multiplicative Wiener noise in the first derivative

Christoph Reisinger

In this article, we extend a Milstein finite difference scheme introduced in [Giles & Reisinger(2011)] for a certain linear stochastic partial differential equation (SPDE), to semi- and fully implicit timestepping as introduced by [Szpruch(2010)] for SDEs. We combine standard finite difference Fourier analysis for PDEs with the linear stability analysis in [Buckwar & Sickenberger(2011)] for SDEs, to analyse the stability and accuracy. The results show that Crank-Nicolson timestepping for the principal part of the drift with a partially implicit but negatively weighted double Itô integral gives unconditional stability over all parameter values, and converges with the expected order in the mean-square sense. This opens up the possibility of local mesh refinement in the spatial domain, and we show experimentally that this can be beneficial in the presence of reduced regularity at boundaries.

NAMay 24, 2016
A partial Fourier transform method for a class of hypoelliptic Kolmogorov equations

Christoph Reisinger, Endre Süli, Alan Whitley

We consider hypoelliptic Kolmogorov equations in $n+1$ spatial dimensions, with $n\geq 1$, where the differential operator in the first $n$ spatial variables featuring in the equation is second-order elliptic, and with respect to the $(n+1)$st spatial variable the equation contains a pure transport term only and is therefore first-order hyperbolic. If the two differential operators, in the first $n$ and in the $(n+1)$st co-ordinate directions, do not commute, we benefit from hypoelliptic regularization in time, and the solution for $t>0$ is smooth even for a Dirac initial datum prescribed at $t=0$. We study specifically the case where the coefficients depend only on the first $n$ variables. In that case, a Fourier transform in the last variable and standard central finite difference approximation in the other variables can be applied for the numerical solution. We prove second-order convergence in the spatial mesh size for the model hypoelliptic equation $\frac{\partial u}{\partial t} + x \frac{\partial u}{\partial y} = \frac{\partial^2 u}{\partial x^2}$ subject to the initial condition $u(x,y,0) = δ(x) δ(y)$, with $(x,y) \in \mathbb{R} \times\mathbb{R}$ and $t>0$, proposed by Kolmogorov, and for an extension with $n=2$. We also demonstrate exponential convergence of an approximation of the inverse Fourier transform based on the trapezium rule. Lastly, we apply the method to a PDE arising in mathematical finance, which models the distribution of the hedging error under a mis-specified derivative pricing model.

CPFeb 28, 2019
A numerical scheme for the quantile hedging problem

Cyril Bénézet, Jean-François Chassagneux, Christoph Reisinger

We consider the numerical approximation of the quantile hedging price in a non-linear market. In a Markovian framework, we propose a numerical method based on a Piecewise Constant Policy Timestepping (PCPT) scheme coupled with a monotone finite difference approximation. We prove the convergence of our algorithm combining BSDE arguments with the Barles & Jakobsen and Barles & Souganidis approaches for non-linear equations. In a numerical section, we illustrate the efficiency of our scheme by considering a financial example in a market with imperfections.

NANov 28, 2018
Stability and error analysis of an implicit Milstein finite difference scheme for a two-dimensional Zakai SPDE

Christoph Reisinger, Zhenru Wang

In this article, we propose an implicit finite difference scheme for a two-dimensional parabolic stochastic partial differential equation (SPDE) of Zakai type. The scheme is based on a Milstein approximation to the stochastic integral and an alternating direction implicit (ADI) discretisation of the elliptic term. We prove its mean-square stability and convergence in L2 of first order in time and second order in space, by Fourier analysis, in the presence of Dirac initial data. Numerical tests confirm these findings empirically.

NADec 8, 2016
Analysis of Multi-Index Monte Carlo Estimators for a Zakai SPDE

Zhenru Wang, Christoph Reisinger

In this article, we propose a space-time Multi-Index Monte Carlo (MIMC) estimator for a one-dimensional parabolic stochastic partial differential equation (SPDE) of Zakai type. We compare the complexity with the Multilevel Monte Carlo (MLMC) method of Giles and Reisinger (2012), and find, by means of Fourier analysis, that the MIMC method: (i) has suboptimal complexity of $O(\varepsilon^{-2}|\log\varepsilon|^3)$ for a root mean square error (RMSE) $\varepsilon$ if the same spatial discretisation as in the MLMC method is used; (ii) has a better complexity of $O(\varepsilon^{-2}|\log\varepsilon|)$ if a carefully adapted discretisation is used; (iii) has to be adapted for non-smooth functionals. Numerical tests confirm these findings empirically.

PRNov 7, 2016
The non-locality of Markov chain approximations to two-dimensional diffusions

Christoph Reisinger

In this short paper, we consider discrete-time Markov chains on lattices as approximations to continuous-time diffusion processes. The approximations can be interpreted as finite difference schemes for the generator of the process. We derive conditions on the diffusion coefficients which permit transition probabilities to match locally first and second moments. We derive a novel formula which expresses how the matching becomes more difficult for larger (absolute) correlations and strongly anisotropic processes, such that instantaneous moves to more distant neighbours on the lattice have to be allowed. Roughly speaking, for non-zero correlations, the distance covered in one timestep is proportional to the ratio of volatilities in the two directions. We discuss the implications to Markov decision processes and the convergence analysis of approximations to Hamilton-Jacobi-Bellman equations in the Barles-Souganidis framework.

NAMay 16, 2018
A penalty scheme and policy iteration for nonlocal HJB variational inequalities with monotone drivers

Christoph Reisinger, Yufei Zhang

We propose a class of numerical schemes for nonlocal HJB variational inequalities (HJBVIs) with monotone drivers. The solution and free boundary of the HJBVI are constructed from a sequence of penalized equations, for which a continuous dependence result is derived and the penalization error is estimated. The penalized equation is then discretized by a class of semi-implicit monotone approximations. We present a novel analysis technique for the well-posedness of the discrete equation, and demonstrate the convergence of the scheme, which subsequently gives a constructive proof for the existence of a solution to the penalized equation and variational inequality. We further propose an efficient iterative algorithm with local superlinear convergence for solving the discrete equation. Numerical experiments are presented for an optimal investment problem under ambiguity and a recursive consumption-portfolio allocation problem.

OCNov 1, 2022
Convergence of policy gradient methods for finite-horizon exploratory linear-quadratic control problems

Michael Giegrich, Christoph Reisinger, Yufei Zhang

We study the global linear convergence of policy gradient (PG) methods for finite-horizon continuous-time exploratory linear-quadratic control (LQC) problems. The setting includes stochastic LQC problems with indefinite costs and allows additional entropy regularisers in the objective. We consider a continuous-time Gaussian policy whose mean is linear in the state variable and whose covariance is state-independent. Contrary to discrete-time problems, the cost is noncoercive in the policy and not all descent directions lead to bounded iterates. We propose geometry-aware gradient descents for the mean and covariance of the policy using the Fisher geometry and the Bures-Wasserstein geometry, respectively. The policy iterates are shown to satisfy an a-priori bound, and converge globally to the optimal policy with a linear rate. We further propose a novel PG method with discrete-time policies. The algorithm leverages the continuous-time analysis, and achieves a robust linear convergence across different action frequencies. A numerical experiment confirms the convergence and robustness of the proposed algorithm.

NAFeb 20, 2018
Stability and convergence of second order backward differentiation schemes for parabolic Hamilton-Jacobi-Bellman equations

Olivier Bokanowski, Athena Picarelli, Christoph Reisinger

We study a second order BDF (Backward Differentiation Formula) scheme for the numerical approximation of parabolic HJB (Hamilton-Jacobi-Bellman) equations. The scheme under consideration is implicit, non-monotone, and second order accurate in time and space. The lack of monotonicity prevents the use of well-known convergence results for solutions in the viscosity sense. In this work, we establish rigorous stability results in a general nonlinear setting as well as convergence results for some particular cases with additional regularity assumptions. While most results are presented for one-dimensional, linear parabolic and non-linear HJB equations, some results are also extended to multiple dimensions and to Isaacs equations. Numerical tests are included to validate the method.

CPOct 3, 2013
Numerical Valuation of Derivatives in High-Dimensional Settings via PDE Expansions

Christoph Reisinger, Rasmus Wissmann

In this article, we propose a new numerical approach to high-dimensional partial differential equations (PDEs) arising in the valuation of exotic derivative securities. The proposed method is extended from Reisinger and Wittum (2007) and uses principal component analysis (PCA) of the underlying process in combination with a Taylor expansion of the value function into solutions to low-dimensional PDEs. The approximation is related to anchored analysis of variance (ANOVA) decompositions and is expected to be accurate whenever the covariance matrix has one or few dominating eigenvalues. A main purpose of the present article is to give a careful analysis of the numerical accuracy and computational complexity compared to state-of-the-art Monte Carlo methods on the example of Bermudan swaptions and Ratchet floors, which are considered difficult benchmark problems. We are able to demonstrate that for problems with medium to high dimensionality and moderate time horizons the presented PDE method delivers results comparable in accuracy to the MC methods considered here in similar or (often significantly) faster runtime.

MLJun 7, 2023
$K$-Nearest-Neighbor Resampling for Off-Policy Evaluation in Stochastic Control

Michael Giegrich, Roel Oomen, Christoph Reisinger

In this paper, we propose a novel $K$-nearest neighbor resampling procedure for estimating the performance of a policy from historical data containing realized episodes of a decision process generated under a different policy. We provide statistical consistency results under weak conditions. In particular, we avoid the common assumption of identically and independently distributed transitions and rewards. Instead, our analysis allows for the sampling of entire episodes, as is common practice in most applications. To establish the consistency in this setting, we generalize Stone's Theorem, a well-known result in nonparametric statistics on local averaging, to include episodic data and the counterfactual estimation underlying off-policy evaluation (OPE). By focusing on feedback policies that depend deterministically on the current state in environments with continuous state-action spaces and system-inherent stochasticity effected by chosen actions, and relying on trajectory simulation similar to Monte Carlo methods, the proposed method is particularly well suited for stochastic control environments. Compared to other OPE methods, our algorithm does not require optimization, can be efficiently implemented via tree-based nearest neighbor search and parallelization, and does not explicitly assume a parametric model for the environment's dynamics. Numerical experiments demonstrate the effectiveness of the algorithm compared to existing baselines in a variety of stochastic control settings, including a linear quadratic regulator, trade execution in limit order books, and online stochastic bin packing.

TRSep 10, 2024
Limit Order Book Simulation and Trade Evaluation with $K$-Nearest-Neighbor Resampling

Michael Giegrich, Roel Oomen, Christoph Reisinger

In this paper, we show how $K$-nearest neighbor ($K$-NN) resampling, an off-policy evaluation method proposed in \cite{giegrich2023k}, can be applied to simulate limit order book (LOB) markets and how it can be used to evaluate and calibrate trading strategies. Using historical LOB data, we demonstrate that our simulation method is capable of recreating realistic LOB dynamics and that synthetic trading within the simulation leads to a market impact in line with the corresponding literature. Compared to other statistical LOB simulation methods, our algorithm has theoretical convergence guarantees under general conditions, does not require optimization, is easy to implement and computationally efficient. Furthermore, we show that in a benchmark comparison our method outperforms a deep learning-based algorithm for several key statistics. In the context of a LOB with pro-rata type matching, we demonstrate how our algorithm can calibrate the size of limit orders for a liquidation strategy. Finally, we describe how $K$-NN resampling can be modified for choices of higher dimensional state spaces.

MLMay 11
Uniform Scaling Limits in AdamW-Trained Transformers

William Gibson, Christoph Reisinger

We study the large-depth limit of transformers trained with AdamW, by modelling the hidden-state dynamics as an interacting particle system (IPS) coupled through the attention mechanism. Under appropriate scaling of the attention heads, we prove that the joint dynamics of the hidden states and backpropagated variables converge in $L^2$, uniformly over the initial condition, to the solution of a forward--backward system of ODEs at rate $\mathcal O(L^{-1}+L^{-1/3}H^{-1/2})$. Here, $L$ and $H$ denote the depth and number of heads of the transformer, respectively. The limiting system of ODEs can be identified with a McKean--Vlasov ODE (MVODE) when the attention heads do not incorporate causal masking. By using the flow maps associated with this MVODE and applying concentration of measure techniques, we obtain bounds on the difference between the discrete and continuous models that are uniform over compact sets of initial conditions. As this is achieved without resorting to a covering argument, the constants in our bounds are independent of the number of tokens. Furthermore, under a suitable adaptation to AdamW, the bounds become independent of the token embedding dimension.

NAJun 8, 2011
Analysis of Linear Difference Schemes in the Sparse Grid Combination Technique

Christoph Reisinger

Sparse grids are tailored to the approximation of smooth high-dimensional functions. On a $d$-dimensional tensor product space, the number of grid points is $N = \mathcal O(h^{-1} |\log h|^{d-1})$, where $h$ is a mesh parameter. The so-called combination technique, based on hierarchical decomposition and extrapolation, requires specific multivariate error expansions of the discretisation error on Cartesian grids to hold. We derive such error expansions for linear difference schemes through an error correction technique of semi-discretisations. We obtain overall error formulae of the type $ε= \mathcal{O} (h^p |\log h|^{d-1})$ and analyse the convergence, with its dependence on dimension and smoothness, by examples of linear elliptic and parabolic problems, with numerical illustrations in up to eight dimensions.

OCJan 16
Model-free policy gradient for discrete-time mean-field control

Matthieu Meunier, Huyên Pham, Christoph Reisinger

We study model-free policy learning for discrete-time mean-field control (MFC) problems with finite state space and compact action space. In contrast to the extensive literature on value-based methods for MFC, policy-based approaches remain largely unexplored due to the intrinsic dependence of transition kernels and rewards on the evolving population state distribution, which prevents the direct use of likelihood-ratio estimators of policy gradients from classical single-agent reinforcement learning. We introduce a novel perturbation scheme on the state-distribution flow and prove that the gradient of the resulting perturbed value function converges to the true policy gradient as the perturbation magnitude vanishes. This construction yields a fully model-free estimator based solely on simulated trajectories and an auxiliary estimate of the sensitivity of the state distribution. Building on this framework, we develop MF-REINFORCE, a model-free policy gradient algorithm for MFC, and establish explicit quantitative bounds on its bias and mean-squared error. Numerical experiments on representative mean-field control tasks demonstrate the effectiveness of the proposed approach.

LGJul 29, 2025
Weighted Conditional Flow Matching

Sergio Calvo-Ordonez, Matthieu Meunier, Alvaro Cartea et al.

Conditional flow matching (CFM) has emerged as a powerful framework for training continuous normalizing flows due to its computational efficiency and effectiveness. However, standard CFM often produces paths that deviate significantly from straight-line interpolations between prior and target distributions, making generation slower and less accurate due to the need for fine discretization at inference. Recent methods enhance CFM performance by inducing shorter and straighter trajectories but typically rely on computationally expensive mini-batch optimal transport (OT). Drawing insights from entropic optimal transport (EOT), we propose Weighted Conditional Flow Matching (W-CFM), a novel approach that modifies the classical CFM loss by weighting each training pair $(x, y)$ with a Gibbs kernel. We show that this weighting recovers the entropic OT coupling up to some bias in the marginals, and we provide the conditions under which the marginals remain nearly unchanged. Moreover, we establish an equivalence between W-CFM and the minibatch OT method in the large-batch limit, showing how our method overcomes computational and performance bottlenecks linked to batch size. Empirically, we test our method on unconditional generation on various synthetic and real datasets, confirming that W-CFM achieves comparable or superior sample quality, fidelity, and diversity to other alternative baselines while maintaining the computational efficiency of vanilla CFM.

LGMar 27, 2025
Efficient Learning for Entropy-Regularized Markov Decision Processes via Multilevel Monte Carlo

Matthieu Meunier, Christoph Reisinger, Yufei Zhang

Designing efficient learning algorithms with complexity guarantees for Markov decision processes (MDPs) with large or continuous state and action spaces remains a fundamental challenge. We address this challenge for entropy-regularized MDPs with Polish state and action spaces, assuming access to a generative model of the environment. We propose a novel family of multilevel Monte Carlo (MLMC) algorithms that integrate fixed-point iteration with MLMC techniques and a generic stochastic approximation of the Bellman operator. We quantify the precise impact of the chosen approximate Bellman operator on the accuracy of the resulting MLMC estimator. Leveraging this error analysis, we show that using a biased plain MC estimate for the Bellman operator results in quasi-polynomial sample complexity, whereas an unbiased randomized multilevel approximation of the Bellman operator achieves polynomial sample complexity in expectation. Notably, these complexity bounds are independent of the dimensions or cardinalities of the state and action spaces, distinguishing our approach from existing algorithms whose complexities scale with the sizes of these spaces. We validate these theoretical performance guarantees through numerical experiments.

CPFeb 15, 2022
Estimating risks of option books using neural-SDE market models

Samuel N. Cohen, Christoph Reisinger, Sheng Wang

In this paper, we examine the capacity of an arbitrage-free neural-SDE market model to produce realistic scenarios for the joint dynamics of multiple European options on a single underlying. We subsequently demonstrate its use as a risk simulation engine for option portfolios. Through backtesting analysis, we show that our models are more computationally efficient and accurate for evaluating the Value-at-Risk (VaR) of option portfolios, with better coverage performance and less procyclicality than standard filtered historical simulation approaches.

CPMay 24, 2021
Arbitrage-free neural-SDE market models

Samuel N. Cohen, Christoph Reisinger, Sheng Wang

Modelling joint dynamics of liquid vanilla options is crucial for arbitrage-free pricing of illiquid derivatives and managing risks of option trade books. This paper develops a nonparametric model for the European options book respecting underlying financial constraints and while being practically implementable. We derive a state space for prices which are free from static (or model-independent) arbitrage and study the inference problem where a model is learnt from discrete time series data of stock and option prices. We use neural networks as function approximators for the drift and diffusion of the modelled SDE system, and impose constraints on the neural nets such that no-arbitrage conditions are preserved. In particular, we give methods to calibrate \textit{neural SDE} models which are guaranteed to satisfy a set of linear inequalities. We validate our approach with numerical experiments using data generated from a Heston stochastic local volatility model.

LGJun 24, 2020
Understanding Deep Architectures with Reasoning Layer

Xinshi Chen, Yufei Zhang, Christoph Reisinger et al.

Recently, there has been a surge of interest in combining deep learning models with reasoning in order to handle more sophisticated learning tasks. In many cases, a reasoning task can be solved by an iterative algorithm. This algorithm is often unrolled, and used as a specialized layer in the deep architecture, which can be trained end-to-end with other neural components. Although such hybrid deep architectures have led to many empirical successes, the theoretical foundation of such architectures, especially the interplay between algorithm layers and other neural layers, remains largely unexplored. In this paper, we take an initial step towards an understanding of such hybrid deep architectures by showing that properties of the algorithm layers, such as convergence, stability, and sensitivity, are intimately related to the approximation and generalization abilities of the end-to-end model. Furthermore, our analysis matches closely our experimental observations under various conditions, suggesting that our theory can provide useful guidelines for designing deep architectures with reasoning layers.

OCJan 9, 2020
Regularity and stability of feedback relaxed controls

Christoph Reisinger, Yufei Zhang

This paper proposes a relaxed control regularization with general exploration rewards to design robust feedback controls for multi-dimensional continuous-time stochastic exit time problems. We establish that the regularized control problem admits a Hölder continuous feedback control, and demonstrate that both the value function and the feedback control of the regularized control problem are Lipschitz stable with respect to parameter perturbations. Moreover, we show that a pre-computed feedback relaxed control has a robust performance in a perturbed system, and derive a first-order sensitivity equation for both the value function and optimal feedback relaxed control. These stability results provide a theoretical justification for recent reinforcement learning heuristics that including an exploration reward in the optimization objective leads to more robust decision making. We finally prove first-order monotone convergence of the value functions for relaxed control problems with vanishing exploration parameters, which subsequently enables us to construct the pure exploitation strategy of the original control problem based on the feedback relaxed controls.

NAJun 5, 2019
A neural network based policy iteration algorithm with global $H^2$-superlinear convergence for stochastic games on domains

Kazufumi Ito, Christoph Reisinger, Yufei Zhang

In this work, we propose a class of numerical schemes for solving semilinear Hamilton-Jacobi-Bellman-Isaacs (HJBI) boundary value problems which arise naturally from exit time problems of diffusion processes with controlled drift. We exploit policy iteration to reduce the semilinear problem into a sequence of linear Dirichlet problems, which are subsequently approximated by a multilayer feedforward neural network ansatz. We establish that the numerical solutions converge globally in the $H^2$-norm, and further demonstrate that this convergence is superlinear, by interpreting the algorithm as an inexact Newton iteration for the HJBI equation. Moreover, we construct the optimal feedback controls from the numerical value functions and deduce convergence. The numerical schemes and convergence results are then extended to HJBI boundary value problems corresponding to controlled diffusion processes with oblique boundary reflection. Numerical experiments on the stochastic Zermelo navigation problem are presented to illustrate the theoretical results and to demonstrate the effectiveness of the method.

NAMar 15, 2019
Rectified deep neural networks overcome the curse of dimensionality for nonsmooth value functions in zero-sum games of nonlinear stiff systems

Christoph Reisinger, Yufei Zhang

In this paper, we establish that for a wide class of controlled stochastic differential equations (SDEs) with stiff coefficients, the value functions of corresponding zero-sum games can be represented by a deep artificial neural network (DNN), whose complexity grows at most polynomially in both the dimension of the state equation and the reciprocal of the required accuracy. Such nonlinear stiff systems may arise, for example, from Galerkin approximations of controlled stochastic partial differential equations (SPDEs), or controlled PDEs with uncertain initial conditions and source terms. This implies that DNNs can break the curse of dimensionality in numerical approximations and optimal control of PDEs and SPDEs. The main ingredient of our proof is to construct a suitable discrete-time system to effectively approximate the evolution of the underlying stochastic dynamics. Similar ideas can also be applied to obtain expression rates of DNNs for value functions induced by stiff systems with regime switching coefficients and driven by general Lévy noise.

NAApr 17, 2019
Analysis of sparse grid multilevel estimators for multi-dimensional Zakai equations

Christoph Reisinger, Zhenru Wang

In this article, we analyse the accuracy and computational complexity of estimators for expected functionals of the solution to multi-dimensional parabolic stochastic partial differential equations (SPDE) of Zakai-type. Here, we use the Milstein scheme for time integration and an alternating direction implicit (ADI) splitting of the spatial finite difference discretisation, coupled with the sparse grid combination technique and multilevel Monte Carlo sampling (MLMC). In the two-dimensional case, we find by detailed Fourier analysis that for a root-mean-square error (RMSE) $\varepsilon$, MLMC on sparse grids has the optimal complexity $O(\varepsilon^{-2})$, whereas MLMC on regular grids has $O(\varepsilon^{-2}(\log\varepsilon)^2)$, standard MC on sparse grids $O(\varepsilon^{-7/2}(|\log\varepsilon|)^{5/2})$, and MC on regular grids $O(\varepsilon^{-4})$. Numerical tests confirm these findings empirically. We give a discussion of the higher-dimensional setting without detailed proofs, which suggests that MLMC on sparse grids always leads to the optimal complexity, standard MC on sparse grids has a fixed complexity order independent of the dimension (up to a logarithmic term), whereas the cost of MLMC and MC on regular grids increases exponentially with the dimension.