Correcting Auto-Differentiation in Neural-ODE Training
This paper addresses a specific technical issue in neural ODE training that is relevant to researchers and practitioners who use high-order numerical methods, and it offers an incremental improvement to gradient computation.
The paper tackles the problem that auto-differentiation produces inaccurate gradients in neural ODE training when the network is built on a high-order numerical method, and shows that the resulting artificial oscillations prevent convergence. The authors propose simple post-processing techniques for Leapfrog and 2-stage explicit Runge-Kutta methods that correct these gradients and enable accurate updates.
Does the use of auto-differentiation yield reasonable updates for deep neural networks (DNNs)? Specifically, when DNNs are designed to adhere to neural ODE architectures, can we trust the gradients provided by auto-differentiation? Through mathematical analysis and numerical evidence, we demonstrate that when neural networks employ high-order methods, such as Linear Multistep Methods (LMM) or Explicit Runge-Kutta Methods (ERK), to approximate the underlying ODE flows, brute-force auto-differentiation often introduces artificial oscillations in the gradients that prevent convergence. In the case of Leapfrog and 2-stage ERK, we propose simple post-processing techniques that effectively eliminate these oscillations, correct the gradient computation, and thus return accurate updates.
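The oscillation described above can be reproduced on a toy problem. The following is a minimal sketch under stated assumptions, not the paper's experiment: the scalar ODE x' = theta(t) * x, the loss 0.5 * x(T)^2, the forward-Euler start-up step, and all function names are hypothetical choices made for illustration. The reverse sweep is written out by hand so that the block needs no autodiff library; it computes the same discrete gradient that auto-differentiation of the Leapfrog recursion would return. The paper's post-processing is not reproduced here.

```python
def leapfrog_forward(theta, x0, h):
    """Integrate x' = theta(t) * x with Leapfrog, x[n+1] = x[n-1] + 2h*theta[n]*x[n].
    The first step uses forward Euler (an assumption of this sketch)."""
    x = [x0, x0 + h * theta[0] * x0]
    for n in range(1, len(theta)):
        x.append(x[n - 1] + 2.0 * h * theta[n] * x[n])
    return x


def loss(theta, x0, h):
    """Terminal loss 0.5 * x_N^2 (hypothetical choice)."""
    return 0.5 * leapfrog_forward(theta, x0, h)[-1] ** 2


def discrete_gradient(theta, x0, h):
    """Reverse-mode sweep through the Leapfrog recursion.
    Returns dLoss/dtheta[n] for every step n."""
    N = len(theta)
    x = leapfrog_forward(theta, x0, h)
    p = [0.0] * (N + 2)          # p[n] = dLoss/dx[n]; p[N+1] is a zero pad
    p[N] = x[N]
    for n in range(N - 1, 0, -1):
        # x[n] feeds x[n+1] with weight 2h*theta[n] and x[n+2] with weight 1
        p[n] = 2.0 * h * theta[n] * p[n + 1] + p[n + 2]
    # theta[0] enters through the Euler start-up step, theta[n>=1] through Leapfrog
    return [h * x[0] * p[1]] + [2.0 * h * x[n] * p[n + 1] for n in range(1, N)]


if __name__ == "__main__":
    h, x0 = 0.1, 1.0
    theta = [0.0] * 10           # initial guess theta(t) = 0
    g = discrete_gradient(theta, x0, h)
    print([round(v, 3) for v in g])
    # At theta = 0 the continuum gradient density is x0^2, i.e. h * x0^2 = 0.1 per step.
    # The discrete gradient instead alternates between 0.0 and 0.2.
```

At theta = 0 the continuum gradient contributes h * x0^2 to every step, yet the discrete gradient alternates between zero and twice that value. Only the average over neighboring steps is consistent with the continuum gradient, which shows in miniature how a gradient-descent update built from the raw values would imprint a step-to-step oscillation on theta.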