NA, Feb 22, 2019
Multilevel Picard iterations for solving smooth semilinear parabolic heat equations
Weinan E, Martin Hutzenthaler, Arnulf Jentzen et al.
We introduce a new family of numerical algorithms for approximating solutions of general high-dimensional semilinear parabolic partial differential equations at single space-time points. The algorithm is obtained through a delicate combination of the Feynman-Kac and the Bismut-Elworthy-Li formulas, and an approximate decomposition of the Picard fixed-point iteration with multilevel accuracy. The algorithm has been tested on a variety of semilinear partial differential equations that arise in physics and finance, with very satisfactory results. Analytical tools needed for the analysis of such algorithms, including a semilinear Feynman-Kac formula, a new class of semi-norms and their recursive inequalities, are also introduced. They allow us to prove for semilinear heat equations with gradient-independent nonlinearity that the computational complexity of the proposed algorithm is bounded by $O(d\,\varepsilon^{-(4+\delta)})$ for any $\delta\in (0,\infty)$ under suitable assumptions, where $d\in \mathbb{N}$ is the dimensionality of the problem and $\varepsilon\in(0,\infty)$ is the prescribed accuracy.
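As a rough illustration of the Picard fixed-point idea named in this abstract, the following sketch writes the iteration out for a simplified model problem. The model equation, the symbols $f$, $g$, $W$, $\Phi$, $u_k$, and the sign convention are assumptions made here for illustration; they are not taken from the paper and omit the gradient terms handled there via the Bismut-Elworthy-Li formula.

```latex
% Assumed model problem (illustrative, gradient-independent nonlinearity):
%   \partial_t u + \tfrac12 \Delta u + f(u) = 0 on [0,T) x R^d,  u(T,x) = g(x),
% with W a d-dimensional standard Brownian motion.
\begin{align*}
  u(t,x) &= \mathbb{E}\big[g(x + W_{T-t})\big]
           + \int_t^T \mathbb{E}\big[f\big(u(s, x + W_{s-t})\big)\big]\,ds
           =: (\Phi(u))(t,x), \\
  u_{k+1} &= \Phi(u_k), \qquad k = 0, 1, 2, \dots, \\
  u_{n} &= \Phi(u_0) + \sum_{l=1}^{n-1} \big(\Phi(u_l) - \Phi(u_{l-1})\big).
\end{align*}
```

The last line is only a telescoping rewrite of $u_n = \Phi(u_{n-1})$. A multilevel scheme would estimate each difference by Monte Carlo, plausibly with fewer samples for larger $l$, where the difference is expected to be small.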
NA, Nov 3, 2017
Multi-level Picard approximations of high-dimensional semilinear parabolic differential equations with gradient-dependent nonlinearities
Martin Hutzenthaler, Thomas Kruse
Parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) have a wide range of applications. In particular, high-dimensional PDEs with gradient-dependent nonlinearities appear often in the state-of-the-art pricing and hedging of financial derivatives. In this article we prove that semilinear heat equations with gradient-dependent nonlinearities can be approximated under suitable assumptions with computational complexity that grows polynomially both in the dimension and the reciprocal of the accuracy.
NA, Sep 24, 2023
Deep neural networks with ReLU, leaky ReLU, and softplus activation provably overcome the curse of dimensionality for Kolmogorov partial differential equations with Lipschitz nonlinearities in the $L^p$-sense
Julia Ackermann, Arnulf Jentzen, Thomas Kruse et al.
Recently, several deep learning (DL) methods for approximating high-dimensional partial differential equations (PDEs) have been proposed. The interest that these methods have generated in the literature is in large part due to simulations which appear to demonstrate that such DL methods have the capacity to overcome the curse of dimensionality (COD) for PDEs in the sense that the number of computational operations they require to achieve a certain approximation accuracy $\varepsilon\in(0,\infty)$ grows at most polynomially in the PDE dimension $d\in\mathbb N$ and the reciprocal of $\varepsilon$. While there is thus far no mathematical result that proves that any of these methods is indeed capable of overcoming the COD, there are now a number of rigorous results in the literature that show that deep neural networks (DNNs) have the expressive power to approximate PDE solutions without the COD in the sense that the number of parameters used to describe the approximating DNN grows at most polynomially in both the PDE dimension $d\in\mathbb N$ and the reciprocal of the approximation accuracy $\varepsilon>0$. Roughly speaking, it has been proved in the literature for every $T>0$ that solutions $u_d\colon [0,T]\times\mathbb R^d\to \mathbb R$, $d\in\mathbb N$, of semilinear heat PDEs with Lipschitz continuous nonlinearities can be approximated by DNNs with ReLU activation at the terminal time in the $L^2$-sense without the COD provided that the initial value functions $\mathbb R^d\ni x\mapsto u_d(0,x)\in\mathbb R$, $d\in\mathbb N$, can be approximated by ReLU DNNs without the COD. The key contribution of this work is to generalize this result by establishing this statement in the $L^p$-sense with $p\in(0,\infty)$ and by allowing the activation function to be more general, covering the ReLU, the leaky ReLU, and the softplus activation functions as special cases.
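For readers unfamiliar with the three activation functions named in this abstract, a minimal Python sketch of their standard definitions follows. The function names and the leaky-ReLU slope parameter `alpha` are assumptions made here; the paper's exact parametrization is not given in the abstract.

```python
import math


def relu(x: float) -> float:
    """ReLU: max(x, 0)."""
    return max(x, 0.0)


def leaky_relu(x: float, alpha: float = 0.01) -> float:
    """Leaky ReLU with an assumed negative-side slope alpha in (0, 1)."""
    return x if x >= 0.0 else alpha * x


def softplus(x: float) -> float:
    """Softplus log(1 + exp(x)), written in a numerically stable form."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

All three functions are Lipschitz continuous with constant at most 1 for `alpha` in (0, 1).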
LG, Apr 28, 2025
Learning Brenier Potentials with Convex Generative Adversarial Neural Networks
Claudia Drygala, Hanno Gottschalk, Thomas Kruse et al.
Brenier proved that under certain conditions on a source and a target probability measure there exists a strictly convex function such that its gradient is a transport map from the source to the target distribution. This function is called the Brenier potential. Furthermore, detailed information on the Hölder regularity of the Brenier potential is available. In this work we develop the statistical learning theory of generative adversarial neural networks that learn the Brenier potential. Since, by the transformation of densities formula, the density of the generated measure depends on the second derivative of the Brenier potential, we develop the universal approximation theory of ReCU networks with cubic activation $\mathtt{ReCU}(x)=\max\{0,x\}^3$ that combines the favorable approximation properties of Hölder functions with a Lipschitz continuous density. In order to ensure the convexity of such general networks, we introduce an adversarial training procedure for a potential function represented by the ReCU networks that combines the classical discriminator cross entropy loss with a penalty term that enforces (strict) convexity. We give a detailed decomposition of learning errors and show that for a suitably high penalty parameter all networks chosen in the adversarial min-max optimization problem are strictly convex. This is further exploited to prove the consistency of the learning procedure for (slowly) expanding network capacity. We also implement the described learning algorithm and apply it to a number of standard test cases, from Gaussian mixtures to image data, as target distributions. As predicted by the theory, we observe that the convexity loss becomes inactive during the training process and the potentials represented by the neural networks have learned convexity.
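To make the activation formula $\mathtt{ReCU}(x)=\max\{0,x\}^3$ concrete, here is a minimal Python sketch of the function and its first two derivatives. The function names are chosen here for illustration and are not from the paper's implementation.

```python
def recu(x: float) -> float:
    """ReCU activation as stated in the abstract: max(0, x) ** 3."""
    return max(0.0, x) ** 3


def recu_prime(x: float) -> float:
    """First derivative: 3 * max(0, x) ** 2."""
    return 3.0 * max(0.0, x) ** 2


def recu_second(x: float) -> float:
    """Second derivative: 6 * max(0, x), which is Lipschitz continuous."""
    return 6.0 * max(0.0, x)
```

The second derivative is itself Lipschitz continuous, whereas the second derivative of ReLU does not exist as a function at 0. This is presumably why a cubic activation suits a setting where the generated density depends on second derivatives of the potential.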
OC, Jun 28, 2025
Deep neural networks can provably solve Bellman equations for Markov decision processes without the curse of dimensionality
Arnulf Jentzen, Konrad Kleinberg, Thomas Kruse
Discrete time stochastic optimal control problems and Markov decision processes (MDPs) are fundamental models for sequential decision-making under uncertainty and as such provide the mathematical framework underlying reinforcement learning theory. A central tool for solving MDPs is the Bellman equation and its solution, the so-called $Q$-function. In this article, we construct deep neural network (DNN) approximations for $Q$-functions associated with MDPs with infinite time horizon and finite control set $A$. More specifically, we show that if the payoff function and the random transition dynamics of the MDP can be suitably approximated by DNNs with leaky rectified linear unit (ReLU) activation, then the solutions $Q_d\colon \mathbb R^d\to \mathbb R^{|A|}$, $d\in \mathbb{N}$, of the associated Bellman equations can also be approximated in the $L^2$-sense by DNNs with leaky ReLU activation whose numbers of parameters grow at most polynomially in both the dimension $d\in \mathbb{N}$ of the state space and the reciprocal $1/\varepsilon$ of the prescribed error $\varepsilon\in (0,1)$. Our proof relies on the recently introduced full-history recursive multilevel fixed-point (MLFP) approximation scheme.
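The Bellman equation for a $Q$-function can be illustrated on a toy problem. The sketch below is not the MLFP scheme from the paper: it is plain fixed-point iteration on a hypothetical two-state, two-action MDP with deterministic transitions and an assumed discount factor `gamma`, whereas the paper treats state space $\mathbb R^d$ and random dynamics. All names and data here are illustrative assumptions.

```python
def q_iteration(reward, next_state, gamma, n_iter):
    """Iterate Q(s, a) <- r(s, a) + gamma * max_b Q(s'(s, a), b).

    reward[s][a]     : payoff for action a in state s (assumed finite table)
    next_state[s][a] : deterministic successor state (assumption of this toy)
    gamma            : assumed discount factor in (0, 1)
    """
    n_states = len(reward)
    n_actions = len(reward[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(n_iter):
        # One application of the Bellman operator to the whole table.
        q = [
            [reward[s][a] + gamma * max(q[next_state[s][a]])
             for a in range(n_actions)]
            for s in range(n_states)
        ]
    return q


# Hypothetical toy MDP: in state 0, action 1 pays 1 and moves to state 1;
# every other (state, action) pair pays 0 and leads to state 0.
reward = [[0.0, 1.0], [0.0, 0.0]]
next_state = [[0, 1], [0, 0]]
q = q_iteration(reward, next_state, gamma=0.5, n_iter=100)
```

Because the Bellman operator is a contraction with factor `gamma` in the supremum norm, the iterates converge geometrically to the unique fixed point, here $Q(0,1)=4/3$ and $Q(0,0)=Q(1,0)=Q(1,1)=2/3$.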
NA, Aug 10, 2017
On multilevel Picard numerical approximations for high-dimensional nonlinear parabolic partial differential equations and high-dimensional nonlinear backward stochastic differential equations
Weinan E, Martin Hutzenthaler, Arnulf Jentzen et al.
Parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) are key ingredients in a number of models in physics and financial engineering. In particular, parabolic PDEs and BSDEs are fundamental tools in the state-of-the-art pricing and hedging of financial derivatives. The PDEs and BSDEs appearing in such applications are often high-dimensional and nonlinear. Since explicit solutions of such PDEs and BSDEs are typically not available, it is a very active topic of research to solve such PDEs and BSDEs approximately. In the recent article [E, W., Hutzenthaler, M., Jentzen, A., and Kruse, T. Linear scaling algorithms for solving high-dimensional nonlinear parabolic differential equations. arXiv:1607.03295 (2017)] we proposed a family of approximation methods based on Picard approximations and multilevel Monte Carlo methods and showed under suitable regularity assumptions on the exact solution for semilinear heat equations that the computational complexity is bounded by $O(d\,\varepsilon^{-(4+\delta)})$ for any $\delta\in(0,\infty)$, where $d$ is the dimensionality of the problem and $\varepsilon\in(0,\infty)$ is the prescribed accuracy. In this paper, we test the applicability of this algorithm on a variety of $100$-dimensional nonlinear PDEs that arise in physics and finance by means of numerical simulations presenting approximation accuracy against runtime. The simulation results for these $100$-dimensional example PDEs are very satisfactory in terms of accuracy and speed. In addition, we also provide a review of other approximation methods for nonlinear PDEs and BSDEs from the literature.