LG NA ML · Jan 5, 2020

Scalable Gradients for Stochastic Differential Equations

Xuechen Li, Ting-Kam Leonard Wong, Ricky T. Q. Chen, David Duvenaud

arXiv:2001.01328v638.4431 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses a computational bottleneck for researchers and practitioners using stochastic differential equations in machine learning, though it appears incremental as an extension of existing adjoint methods.

The authors generalize the adjoint sensitivity method from ordinary to stochastic differential equations, which enables time-efficient, constant-memory gradient computation with high-order adaptive solvers. They apply the method to fit stochastic dynamics defined by neural networks and report competitive performance on a 50-dimensional motion capture dataset.

The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for caching noise, and conditions under which numerical solutions converge. In addition, we combine our method with gradient-based stochastic variational inference for latent stochastic differential equations. We use our method to fit stochastic dynamics defined by neural networks, achieving competitive performance on a 50-dimensional motion capture dataset.
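The abstract mentions a memory-efficient algorithm for caching noise, which matters because a backward pass must see the same Brownian sample path as the forward pass. The sketch below illustrates one way such a path can be regenerated on demand instead of stored: the value at any time is rebuilt from a seed by repeated Brownian-bridge bisection. It is a minimal illustration under stated assumptions, not the authors' implementation; the function name `brownian_at`, the seed-per-node scheme, and the tolerance `tol` are all hypothetical choices made for this example.

```python
import math
import random


def brownian_at(t, seed, t0=0.0, t1=1.0, tol=1e-6):
    """Return W(t) for one fixed scalar Brownian path on [t0, t1].

    Hypothetical helper: nothing is cached between calls. Each bisection
    node draws its value from a generator seeded by (seed, node index),
    so repeated queries always rebuild the same path.
    """
    if not t0 <= t <= t1:
        raise ValueError("t must lie in [t0, t1]")

    # Endpoints of the path: W(t0) = 0 and W(t1) ~ N(0, t1 - t0).
    lo, hi = t0, t1
    w_lo = 0.0
    w_hi = random.Random(f"{seed}:0").gauss(0.0, math.sqrt(t1 - t0))

    node = 1  # heap-style index of the current interval in the bisection tree
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Brownian bridge at the midpoint, conditioned on both endpoints:
        # mean (w_lo + w_hi) / 2, variance (hi - lo) / 4.
        rng = random.Random(f"{seed}:{node}")
        w_mid = rng.gauss(0.5 * (w_lo + w_hi), math.sqrt(0.25 * (hi - lo)))
        if t < mid:
            hi, w_hi, node = mid, w_mid, 2 * node
        else:
            lo, w_lo, node = mid, w_mid, 2 * node + 1

    # Below the tolerance, interpolate linearly between the bracketing values.
    frac = (t - lo) / (hi - lo)
    return w_lo + frac * (w_hi - w_lo)


# Queries can arrive in any order and still agree on a single path.
forward = [brownian_at(k / 10, seed=0) for k in range(11)]
backward = [brownian_at(k / 10, seed=0) for k in reversed(range(11))]
assert forward == backward[::-1]
```

Each query costs a number of draws logarithmic in `1 / tol`, and the only state carried between queries is the seed. The trade-off in this sketch is that time is spent regenerating values that a cache would have kept.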
