LG NA ML
Jan 5, 2020

Scalable Gradients for Stochastic Differential Equations

arXiv:2001.01328v6
422 citations
AI Analysis

This work addresses a computational bottleneck for researchers and practitioners using stochastic differential equations in machine learning, though it appears incremental as an extension of existing adjoint methods.

The authors tackled the problem of computing gradients for stochastic differential equations by generalizing the adjoint sensitivity method, enabling time-efficient and constant-memory gradient computation with high-order adaptive solvers. They applied this method to fit neural network-defined stochastic dynamics, achieving competitive performance on a 50-dimensional motion capture dataset.

The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for caching noise, and conditions under which numerical solutions converge. In addition, we combine our method with gradient-based stochastic variational inference for latent stochastic differential equations. We use our method to fit stochastic dynamics defined by neural networks, achieving competitive performance on a 50-dimensional motion capture dataset.

Code Implementations (4 repos)
