LG DS NA May 21, 2021

Error Bounds of the Invariant Statistics in Machine Learning of Ergodic Itô Diffusions

arXiv:2105.10102v28.48 citations

Originality: Incremental advance

AI Analysis

This work addresses theoretical foundations for machine learning of stochastic systems, providing insights into error bounds for practitioners in fields like computational physics or finance, but it is incremental as it builds on existing perturbation and linear response theories.

This paper tackles the problem of understanding convergence properties of invariant statistics when learning ergodic Itô diffusions via supervised regression, finding that errors in these statistics depend linearly on errors in learning drift and diffusion coefficients, and that standard L²-norm generalization error is insufficient for this result.

This paper studies the theoretical underpinnings of machine learning of ergodic Itô diffusions. The objective is to understand the convergence properties of the invariant statistics when the underlying system of stochastic differential equations (SDEs) is empirically estimated with a supervised regression framework. Using the perturbation theory of ergodic Markov chains and the linear response theory, we deduce a linear dependence of the errors of one-point and two-point invariant statistics on the error in the learning of the drift and diffusion coefficients. More importantly, our study shows that the usual $L^2$-norm characterization of the learning generalization error is insufficient for achieving this linear dependence result. We find that sufficient conditions for such a linear dependence result are through learning algorithms that produce a uniformly Lipschitz and consistent estimator in the hypothesis space that retains certain characteristics of the drift coefficients, such as the usual linear growth condition that guarantees the existence of solutions of the underlying SDEs. We examine these conditions on two well-understood learning algorithms: the kernel-based spectral regression method and the shallow random neural networks with the ReLU activation function.
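As a toy illustration of what "linear dependence" of an invariant statistic on the coefficient error means, the sketch below uses a one-dimensional Ornstein-Uhlenbeck process, dX = -theta X dt + sigma dW, whose stationary variance sigma^2 / (2 theta) is known in closed form. This example is not taken from the paper: the choice of process, the parameter values, and the function name `ou_invariant_variance` are all assumptions made here for illustration. Both coefficients are perturbed by the same amount eps, standing in for a learning error, and the ratio of the resulting variance error to eps is computed.

```python
# Toy example (an assumption for illustration, not the paper's setting):
# Ornstein-Uhlenbeck process dX = -theta * X dt + sigma dW with theta > 0.

def ou_invariant_variance(theta, sigma):
    """Closed-form stationary variance of the OU process."""
    return sigma ** 2 / (2.0 * theta)

# "True" coefficients (hypothetical values).
theta, sigma = 1.0, 0.5
true_var = ou_invariant_variance(theta, sigma)

# Perturb drift and diffusion parameters by eps, mimicking an estimation error.
eps_values = [1e-1, 1e-2, 1e-3]
errors = [
    abs(ou_invariant_variance(theta + eps, sigma + eps) - true_var)
    for eps in eps_values
]

# If the statistic's error depends linearly on the coefficient error,
# error / eps stays roughly constant as eps shrinks.
ratios = [err / eps for err, eps in zip(errors, eps_values)]

for eps, err, ratio in zip(eps_values, errors, ratios):
    print(f"eps={eps:.0e}  error={err:.6e}  error/eps={ratio:.4f}")
```

The ratio settles near the derivative of the stationary variance along the perturbation direction, which is the behavior a linear error bound describes. In this scalar example the perturbation is a constant shift, so it does not show why an $L^2$-norm bound on the coefficient error can fail to control the invariant statistics; that part of the abstract's claim concerns function-valued errors.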
