Sparse learning of stochastic dynamic equations
This work addresses the challenge of modeling biophysical processes using stochastic dynamics, but it is incremental as it builds directly on the existing SINDy method.
The authors tackled the problem of extracting governing equations from data for stochastic dynamical systems, extending the SINDy framework to handle stochasticity and proving its asymptotic correctness in the infinite data limit, with demonstrations on test systems like diffusion processes.
With the rapid increase of available data for complex systems, there is great interest in the extraction of physically relevant information from massive datasets. Recently, a framework called Sparse Identification of Nonlinear Dynamics (SINDy) has been introduced to identify the governing equations of dynamical systems from simulation data. In this study, we extend SINDy to stochastic dynamical systems, which are frequently used to model biophysical processes. We prove the asymptotic correctness of stochastic SINDy in the infinite data limit, both in the original and projected variables. We discuss algorithms to solve the sparse regression problem arising from the practical implementation of SINDy, and show that cross-validation is an essential tool to determine the right level of sparsity. We demonstrate the proposed methodology on two test systems, namely, diffusion in a one-dimensional potential and the projected dynamics of a two-dimensional diffusion process.
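To make the sparse regression and cross-validation steps concrete, the following is a minimal sketch for a one-dimensional diffusion, not the authors' implementation. Everything specific in it is an assumption chosen for illustration: the double-well potential V(x) = x^4 - 2x^2, the diffusion constant, the Euler-Maruyama simulation, the monomial library up to degree 5, the use of sequentially thresholded least squares as the sparse solver, and the threshold grid. The drift is estimated by regressing finite-difference increments onto the library, and the threshold is picked by 5-fold cross-validation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed test system: dX = -V'(X) dt + sqrt(2 D) dW with V(x) = x^4 - 2 x^2
D, dt, n_traj, n_steps = 0.5, 5e-3, 400, 1000

def drift(x):
    # True drift -V'(x); the regression below should recover 4*x - 4*x^3
    return 4.0 * x - 4.0 * x**3

# Euler-Maruyama simulation of many independent trajectories
x = rng.uniform(-1.5, 1.5, size=n_traj)
X = np.empty((n_steps + 1, n_traj))
X[0] = x
for k in range(n_steps):
    x = x + drift(x) * dt + np.sqrt(2.0 * D * dt) * rng.standard_normal(n_traj)
    X[k + 1] = x

x_now = X[:-1].ravel()
y = ((X[1:] - X[:-1]) / dt).ravel()  # noisy finite-difference drift estimate

def library(x, degree=5):
    # Columns are the monomials 1, x, x^2, ..., x^degree
    return np.vander(x, degree + 1, increasing=True)

def stlsq(Theta, y, lam, n_iter=10):
    # Sequentially thresholded least squares: fit, zero small terms, refit
    coef = np.linalg.lstsq(Theta, y, rcond=None)[0]
    keep = np.ones(coef.size, dtype=bool)
    for _ in range(n_iter):
        new_keep = keep & (np.abs(coef) >= lam)
        if np.array_equal(new_keep, keep):
            break  # support no longer changes
        keep = new_keep
        coef = np.zeros_like(coef)
        if not keep.any():
            break  # threshold removed every term
        coef[keep] = np.linalg.lstsq(Theta[:, keep], y, rcond=None)[0]
    return coef

def cv_error(Theta, y, lam, n_folds=5):
    # Mean squared prediction error on held-out folds
    folds = np.arange(len(y)) % n_folds
    errs = []
    for f in range(n_folds):
        train, test = folds != f, folds == f
        coef = stlsq(Theta[train], y[train], lam)
        errs.append(np.mean((Theta[test] @ coef - y[test]) ** 2))
    return float(np.mean(errs))

Theta = library(x_now)
thresholds = [0.0, 0.5, 1.0, 2.0, 8.0]  # assumed grid; 8.0 removes all terms
errors = [cv_error(Theta, y, lam) for lam in thresholds]
best_lam = thresholds[int(np.argmin(errors))]
coef = stlsq(Theta, y, best_lam)

print("CV errors:", dict(zip(thresholds, np.round(errors, 3))))
print("selected threshold:", best_lam)
print("coefficients of 1, x, ..., x^5:", np.round(coef, 2))
```

In this sketch the held-out error rises sharply once the threshold is large enough to remove the true drift terms, which is the signal cross-validation provides. The differences among the smaller thresholds are tiny compared with the noise level, so a plain minimum over the grid may not always return the sparsest adequate model; a tie-breaking rule that favors sparser models is a common refinement.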