Robust and Scalable SDE Learning: A Functional Perspective
This work addresses a computational bottleneck for researchers and practitioners who use SDEs as generative models, though the contribution is incremental in that it builds on existing learning methods.
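The authors address the computational expense of learning stochastic differential equations (SDEs), which stems from the sequential nature of SDE integrators. They propose an importance-sampling estimator for the probabilities of observations that does not rely on such integrators. Compared with integrator-based algorithms, the estimator yields lower-variance gradient estimates and is embarrassingly parallelizable, which lets large-scale parallel hardware deliver massive decreases in computation time.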
Stochastic differential equations provide a rich class of flexible generative models, capable of describing a wide range of spatio-temporal processes. A host of recent work looks to learn data-representing SDEs, using neural networks and other flexible function approximators. Despite these advances, learning remains computationally expensive due to the sequential nature of SDE integrators. In this work, we propose an importance-sampling estimator for probabilities of observations of SDEs for the purposes of learning. Crucially, the approach we suggest does not rely on such integrators. The proposed method produces lower-variance gradient estimates compared to algorithms based on SDE integrators and has the added advantage of being embarrassingly parallelizable. This facilitates the effective use of large-scale parallel hardware for massive decreases in computation time.
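The contrast the abstract draws between sequential integrators and parallelizable estimators can be sketched with two toy routines. Neither is the estimator proposed in the paper. The first is a standard Euler-Maruyama integrator, chosen here only as a representative sequential SDE integrator. The second is a textbook importance-sampling estimate of a Gaussian tail probability, chosen only to show that each term of such an estimate depends on its own independent draw. The function names, the tail-probability target, and the shifted Gaussian proposal are all assumptions made for illustration.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, t_end, n_steps, rng):
    """Simulate one path of dX = drift(X) dt + diffusion(X) dW.

    Each step needs the state produced by the previous step, so the
    loop cannot be parallelized across time steps.
    """
    dt = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


def importance_sampling_tail(threshold, n_samples, rng):
    """Estimate P(Z > threshold) for Z ~ N(0, 1).

    Illustrative proposal: N(threshold, 1). Every term is computed from
    its own independent draw, so the terms could be evaluated in
    parallel and averaged at the end.
    """
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Density ratio N(0, 1) / N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / n_samples


rng = random.Random(0)

# Hypothetical example: an Ornstein-Uhlenbeck-type process dX = -X dt + 0.5 dW.
path = euler_maruyama(lambda x: -x, lambda x: 0.5, 1.0, 1.0, 100, rng)

# Tail probability P(Z > 3), compared against the closed-form value.
estimate = importance_sampling_tail(3.0, 20000, rng)
exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))
print(len(path), estimate, exact)
```

In the first routine the number of time steps sets a floor on wall-clock time regardless of how many processors are available. In the second, the loop is written serially for simplicity, but its iterations share no state apart from the running sum and the random number generator, so the loop could be split across workers, each with its own generator, and the results averaged.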