PRMar 5, 2016
Branching diffusion representation of semilinear PDEs and Monte Carlo approximationPierre Henry-Labordere, Nadia Oudjane, Xiaolu Tan et al.
We provide a representation result of parabolic semi-linear PD-Es, with polynomial nonlinearity, by branching diffusion processes. We extend the classical representation for KPP equations, introduced by Skorokhod (1964), Watanabe (1965) and McKean (1975), by allowing for polynomial nonlinearity in the pair $(u, Du)$, where $u$ is the solution of the PDE with space gradient $Du$. Similar to the previous literature, our result requires a non-explosion condition which restrict to "small maturity" or "small nonlinearity" of the PDE. Our main ingredient is the automatic differentiation technique as in Henry Labordere, Tan and Touzi (2015), based on the Malliavin integration by parts, which allows to account for the nonlinearities in the gradient. As a consequence, the particles of our branching diffusion are marked by the nature of the nonlinearity. This new representation has very important numerical implications as it is suitable for Monte Carlo simulation. Indeed, this provides the first numerical method for high dimensional nonlinear PDEs with error estimate induced by the dimension-free Central limit theorem. The complexity is also easily seen to be of the order of the squared dimension. The final section of this paper illustrates the efficiency of the algorithm by some high dimensional numerical experiments.
MLSep 8, 2023
Actor critic learning algorithms for mean-field control with moment neural networksHuyên Pham, Xavier Warin
We develop a new policy gradient and actor-critic algorithm for solving mean-field control problems within a continuous time reinforcement learning setting. Our approach leverages a gradient-based representation of the value function, employing parametrized randomized policies. The learning for both the actor (policy) and critic (value function) is facilitated by a class of moment neural network functions on the Wasserstein space of probability measures, and the key feature is to sample directly trajectories of distributions. A central challenge addressed in this study pertains to the computational treatment of an operator specific to the mean-field framework. To illustrate the effectiveness of our methods, we provide a comprehensive set of numerical results. These encompass diverse examples, including multi-dimensional settings and nonlinear quadratic mean-field control problems with controlled volatility.
MLMar 20, 2023
Quantile and moment neural networks for learning functionals of distributionsXavier Warin
We study news neural networks to approximate function of distributions in a probability space. Two classes of neural networks based on quantile and moment approximation are proposed to learn these functions and are theoretically supported by universal approximation theorems. By mixing the quantile and moment features in other new networks, we develop schemes that outperform existing networks on numerical test cases involving univariate distributions. For bivariate distributions, the moment neural network outperforms all other networks.
47.0LGMay 4
P1-KAN: an effective Kolmogorov-Arnold network with application to hydraulic valley optimizationXavier Warin
A new Kolmogorov-Arnold network (KAN) is proposed to approximate potentially irregular functions in high dimensions. We provide error bounds for this approximation, assuming that the Kolmogorov-Arnold expansion functions are sufficiently smooth. When the function is only continuous, we also provide universal approximation theorems. We show that it outperforms multilayer perceptrons in terms of accuracy and convergence speed. We also compare it with several proposed KAN networks: it outperforms all networks for irregular functions and achieves similar accuracy to the original spline-based KAN network for smooth functions. Finally, we compare some of the KAN networks in optimizing a French hydraulic valley.
MLMay 27, 2025
Input Convex Kolmogorov Arnold NetworksThomas Deschatre, Xavier Warin
This article presents an input convex neural network architecture using Kolmogorov-Arnold networks (ICKAN). Two specific networks are presented: the first is based on a low-order, linear-by-part, representation of functions, and a universal approximation theorem is provided. The second is based on cubic splines, for which only numerical results support convergence. We demonstrate on simple tests that these networks perform competitively with classical input convex neural networks (ICNNs). In a second part, we use the networks to solve some optimal transport problems needing a convex approximation of functions and demonstrate their effectiveness. Comparisons with ICNNs show that cubic ICKANs produce results similar to those of classical ICNNs.
LGNov 5, 2024
A spectral mixture representation of isotropic kernels to generalize random Fourier featuresNicolas Langrené, Xavier Warin, Pierre Gruet
Rahimi and Recht (2007) introduced the idea of decomposing positive definite shift-invariant kernels by randomly sampling from their spectral distribution. This famous technique, known as Random Fourier Features (RFF), is in principle applicable to any such kernel whose spectral distribution can be identified and simulated. In practice, however, it is usually applied to the Gaussian kernel because of its simplicity, since its spectral distribution is also Gaussian. Clearly, simple spectral sampling formulas would be desirable for broader classes of kernels. In this paper, we show that the spectral distribution of positive definite isotropic kernels in $\mathbb{R}^{d}$ for all $d\geq1$ can be decomposed as a scale mixture of $α$-stable random vectors, and we identify the mixing distribution as a function of the kernel. This constructive decomposition provides a simple and ready-to-use spectral sampling formula for many multivariate positive definite shift-invariant kernels, including exponential power kernels, generalized Matérn kernels, generalized Cauchy kernels, as well as newly introduced kernels such as the Beta, Kummer, and Tricomi kernels. In particular, we retrieve the fact that the spectral distributions of these kernels are scale mixtures of the multivariate Gaussian distribution, along with an explicit mixing distribution formula. This result has broad applications for support vector machines, kernel ridge regression, Gaussian processes, and other kernel-based machine learning techniques for which the random Fourier features technique is applicable.
MLAug 3, 2025
Fast Gaussian process inference by exact Matérn kernel decompositionNicolas Langrené, Xavier Warin, Pierre Gruet
To speed up Gaussian process inference, a number of fast kernel matrix-vector multiplication (MVM) approximation algorithms have been proposed over the years. In this paper, we establish an exact fast kernel MVM algorithm based on exact kernel decomposition into weighted empirical cumulative distribution functions, compatible with a class of kernels which includes multivariate Matérn kernels with half-integer smoothness parameter. This algorithm uses a divide-and-conquer approach, during which sorting outputs are stored in a data structure. We also propose a new algorithm to take into account some linear fixed effects predictor function. Our numerical experiments confirm that our algorithm is very effective for low-dimensional Gaussian process inference problems with hundreds of thousands of data points. An implementation of our algorithm is available at https://gitlab.com/warin/fastgaussiankernelregression.git.
LGFeb 7, 2020
Discretization and Machine Learning Approximation of BSDEs with a Constraint on the Gains-ProcessIdris Kharroubi, Thomas Lim, Xavier Warin
We study the approximation of backward stochastic differential equations (BSDEs for short) with a constraint on the gains process. We first discretize the constraint by applying a so-called facelift operator at times of a grid. We show that this discretely constrained BSDE converges to the continuously constrained one as the mesh grid converges to zero. We then focus on the approximation of the discretely constrained BSDE. For that we adopt a machine learning approach. We show that the facelift can be approximated by an optimization problem over a class of neural networks under constraints on the neural network and its derivative. We then derive an algorithm converging to the discretely constrained BSDE as the number of neurons goes to infinity. We end by numerical experiments. Mathematics Subject Classification (2010): 65C30, 65M75, 60H35, 93E20, 49L25.
OCJul 31, 2019
Neural networks-based backward scheme for fully nonlinear PDEsHuyen Pham, Xavier Warin, Maximilien Germain
We propose a numerical method for solving high dimensional fully nonlinear partial differential equations (PDEs). Our algorithm estimates simultaneously by backward time induction the solution and its gradient by multi-layer neural networks, while the Hessian is approximated by automatic differentiation of the gradient at previous step. This methodology extends to the fully nonlinear case the approach recently proposed in \cite{HPW19} for semi-linear PDEs. Numerical tests illustrate the performance and accuracy of our method on several examples in high dimension with nonlinearity on the Hessian term including a linear quadratic control problem with control on the diffusion coefficient, Monge-Amp{è}re equation and Hamilton-Jacobi-Bellman equation in portfolio optimization.
PRFeb 5, 2019
Deep backward schemes for high-dimensional nonlinear PDEsCôme Huré, Huyên Pham, Xavier Warin
We propose new machine learning schemes for solving high dimensional nonlinear partial differential equations (PDEs). Relying on the classical backward stochastic differential equation (BSDE) representation of PDEs, our algorithms estimate simultaneously the solution and its gradient by deep neural networks. These approximations are performed at each time step from the minimization of loss functions defined recursively by backward induction. The methodology is extended to variational inequalities arising in optimal stopping problems. We analyze the convergence of the deep learning schemes and provide error estimates in terms of the universal approximation of neural networks. Numerical results show that our algorithms give very good results till dimension 50 (and certainly above), for both PDEs and variational inequalities problems. For the PDEs resolution, our results are very similar to those obtained by the recent method in \cite{weinan2017deep} when the latter converges to the right solution or does not diverge. Numerical tests indicate that the proposed methods are not stuck in poor local minimaas it can be the case with the algorithm designed in \cite{weinan2017deep}, and no divergence is experienced. The only limitation seems to be due to the inability of the considered deep neural networks to represent a solution with a too complex structure in high dimension.
LGSep 20, 2018
Machine Learning for semi linear PDEsQuentin Chan-Wai-Nam, Joseph Mikael, Xavier Warin
Recent machine learning algorithms dedicated to solving semi-linear PDEs are improved by using different neural network architectures and different parameterizations. These algorithms are compared to a new one that solves a fixed point problem by using deep learning techniques. This new algorithm appears to be competitive in terms of accuracy with the best existing algorithms.
NAJul 28, 2017
Numerical approximation of BSDEs using local polynomial drivers and branching processesBruno Bouchard, Xiaolu Tan, Xavier Warin et al.
We propose a new numerical scheme for Backward Stochastic Differential Equations based on branching processes. We approximate an arbitrary (Lipschitz) driver by local polynomials and then use a Picard iteration scheme. Each step of the Picard iteration can be solved by using a representation in terms of branching diffusion systems, thus avoiding the need for a fine time discretization. In contrast to the previous literature on the numerical resolution of BSDEs based on branching processes, we prove the convergence of our numerical scheme without limitation on the time horizon. Numerical simulations are provided to illustrate the performance of the algorithm.
PRAug 25, 2010
A Probabilistic Numerical Method for Fully Nonlinear Parabolic PDEsArash Fahim, Nizar Touzi, Xavier Warin
We consider the probabilistic numerical scheme for fully nonlinear PDEs suggested in \cite{cstv}, and show that it can be introduced naturally as a combination of Monte Carlo and finite differences scheme without appealing to the theory of backward stochastic differential equations. Our first main result provides the convergence of the discrete-time approximation and derives a bound on the discretization error in terms of the time step. An explicit implementable scheme requires to approximate the conditional expectation operators involved in the discretization. This induces a further Monte Carlo error. Our second main result is to prove the convergence of the latter approximation scheme, and to derive an upper bound on the approximation error. Numerical experiments are performed for the approximation of the solution of the mean curvature flow equation in dimensions two and three, and for two and five-dimensional (plus time) fully-nonlinear Hamilton-Jacobi-Bellman equations arising in the theory of portfolio optimization in financial mathematics.