LG | NA | Jun 3, 2023

Correcting Auto-Differentiation in Neural-ODE Training

arXiv:2306.02192v2 | 1 citation | h-index: 64
Originality: Incremental advance
AI Analysis

This paper addresses a specific technical issue in neural ODE training: inaccurate gradients from auto-differentiation. It is aimed at researchers and practitioners who use high-order numerical methods, and it represents an incremental improvement to gradient computation.

The paper tackles the problem of auto-differentiation producing inaccurate gradients in neural ODE training when high-order numerical methods are used, showing that this causes artificial oscillations that prevent convergence. The authors propose simple post-processing techniques for Leapfrog and 2-stage explicit Runge-Kutta methods that correct these gradients and enable accurate updates.

Does the use of auto-differentiation yield reasonable updates for deep neural networks (DNNs)? Specifically, when DNNs are designed to adhere to neural ODE architectures, can we trust the gradients provided by auto-differentiation? Through mathematical analysis and numerical evidence, we demonstrate that when neural networks employ high-order methods, such as Linear Multistep Methods (LMM) or Explicit Runge-Kutta Methods (ERK), to approximate the underlying ODE flows, brute-force auto-differentiation often introduces artificial oscillations in the gradients that prevent convergence. In the case of Leapfrog and 2-stage ERK, we propose simple post-processing techniques that effectively eliminate these oscillations, correct the gradient computation, and thus return accurate updates.
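
To make the setting in the abstract concrete, here is a minimal sketch of a Leapfrog-discretized network and the per-layer gradient that brute-force differentiation of the discrete recursion produces. Everything specific in it is an assumption made for illustration and is not taken from the paper: the scalar state, the layer function `f(z, theta) = theta * tanh(z)`, the forward-Euler starter step, the squared-error loss, and all function names. The reverse sweep is written by hand so the example needs no libraries; it applies the same discrete chain rule that reverse-mode auto-differentiation would. The sketch does not implement the post-processing correction the paper proposes.

```python
import math


def f(z, theta):
    # Hypothetical scalar layer: theta scales a tanh nonlinearity.
    return theta * math.tanh(z)


def leapfrog_forward(thetas, z0, h):
    # z[1] comes from one forward-Euler step (an assumed starter), then
    # Leapfrog: z[n+1] = z[n-1] + 2*h*f(z[n], thetas[n]).
    z = [z0, z0 + h * f(z0, thetas[0])]
    for n in range(1, len(thetas)):
        z.append(z[n - 1] + 2.0 * h * f(z[n], thetas[n]))
    return z


def loss(thetas, z0, h, target):
    # Assumed loss: squared error on the final state.
    z_final = leapfrog_forward(thetas, z0, h)[-1]
    return 0.5 * (z_final - target) ** 2


def leapfrog_gradient(thetas, z0, h, target):
    # Reverse sweep through the discrete recursion, i.e. the chain rule
    # that reverse-mode auto-differentiation applies to the forward pass.
    z = leapfrog_forward(thetas, z0, h)
    num_layers = len(thetas)
    adj = [0.0] * (num_layers + 1)  # adj[n] accumulates dLoss/dz[n]
    adj[num_layers] = z[num_layers] - target
    grad = [0.0] * num_layers
    for n in range(num_layers - 1, 0, -1):
        t = math.tanh(z[n])
        grad[n] = adj[n + 1] * 2.0 * h * t
        adj[n] += adj[n + 1] * 2.0 * h * thetas[n] * (1.0 - t * t)
        adj[n - 1] += adj[n + 1]
    # The Euler starter step z[1] = z[0] + h*f(z[0], thetas[0]).
    grad[0] = adj[1] * h * math.tanh(z[0])
    return grad


if __name__ == "__main__":
    # Smoothly varying layer parameters (arbitrary illustrative values).
    thetas = [-1.0 + 0.1 * n for n in range(20)]
    for n, g in enumerate(leapfrog_gradient(thetas, z0=1.0, h=0.05, target=0.0)):
        print(n, g)
```

Printing the gradient against the layer index `n` is one way to look for the layer-to-layer oscillation the abstract describes. Whether and how strongly it shows up in this toy depends on the assumed parameters, step size, and starter step.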
