LGJan 14, 2022

Taylor-Lagrange Neural Ordinary Differential Equations: Toward Fast Training and Evaluation of Neural ODEs

arXiv:2201.05715v220 citations
AI Analysis

This addresses a critical bottleneck in applying NODEs to practical problems like dynamical systems modeling, image classification, and density estimation, offering significant speed improvements for researchers and practitioners.

The paper tackles the slow training and evaluation of Neural Ordinary Differential Equations (NODEs) by proposing Taylor-Lagrange NODEs, which use fixed-order Taylor expansions and learn error estimates to achieve the same accuracy as adaptive methods with much lower computational cost, resulting in training speeds more than an order of magnitude faster than state-of-the-art approaches without performance loss.

Neural ordinary differential equations (NODEs) -- parametrizations of differential equations using neural networks -- have shown tremendous promise in learning models of unknown continuous-time dynamical systems from data. However, every forward evaluation of a NODE requires numerical integration of the neural network used to capture the system dynamics, making their training prohibitively expensive. Existing works rely on off-the-shelf adaptive step-size numerical integration schemes, which often require an excessive number of evaluations of the underlying dynamics network to obtain sufficient accuracy for training. By contrast, we accelerate the evaluation and the training of NODEs by proposing a data-driven approach to their numerical integration. The proposed Taylor-Lagrange NODEs (TL-NODEs) use a fixed-order Taylor expansion for numerical integration, while also learning to estimate the expansion's approximation error. As a result, the proposed approach achieves the same accuracy as adaptive step-size schemes while employing only low-order Taylor expansions, thus greatly reducing the computational cost necessary to integrate the NODE. A suite of numerical experiments, including modeling dynamical systems, image classification, and density estimation, demonstrate that TL-NODEs can be trained more than an order of magnitude faster than state-of-the-art approaches, without any loss in performance.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes