LG · Nov 25, 2021

Characteristic Neural Ordinary Differential Equations

arXiv:2111.13207v4 · 3 citations
Originality: Incremental advance
AI Analysis

This work extends NODEs to more complex dynamics, a problem relevant to machine learning researchers and practitioners. It is rated incremental because it builds directly on existing NODE frameworks.

The authors address the restriction of Neural Ordinary Differential Equations (NODEs) to ODE dynamics by proposing Characteristic-Neural Ordinary Differential Equations (C-NODEs). C-NODEs model latent variable evolution with partial differential equations that are solved along characteristic curves. The reported result is improved performance and computational efficiency on classification and density estimation for CIFAR-10, SVHN, and MNIST, with fewer parameters and function evaluations than the baselines.
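The reduction from a PDE to ODEs along characteristic curves can be written out with the textbook method of characteristics. The block below is a generic illustration, not the paper's own formulation: the coefficient functions $a_i$, the source term $c$, and the curve parameter $s$ are notation assumed here.

```latex
% First-order quasi-linear PDE for u(x_1, ..., x_k); a_i and c are assumed notation.
\[
  \sum_{i=1}^{k} a_i(x, u)\,\frac{\partial u}{\partial x_i} = c(x, u)
\]
% Along a curve x(s) chosen so that dx_i/ds = a_i(x, u), the chain rule gives
\[
  \frac{dx_i}{ds} = a_i(x, u), \qquad
  \frac{du}{ds} = \sum_{i=1}^{k} \frac{\partial u}{\partial x_i}\,\frac{dx_i}{ds} = c(x, u)
\]
% so the PDE becomes a system of ODEs in s, which a standard ODE solver can integrate.
```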

We propose Characteristic-Neural Ordinary Differential Equations (C-NODEs), a framework for extending Neural Ordinary Differential Equations (NODEs) beyond ODEs. While NODEs model the evolution of latent variables as the solution to an ODE, C-NODEs model the evolution of the latent variables as the solution of a family of first-order quasi-linear partial differential equations (PDEs) along curves on which the PDEs reduce to ODEs, referred to as characteristic curves. This in turn allows the application of the standard frameworks for solving ODEs, namely the adjoint method. Learning optimal characteristic curves for given tasks improves performance and computational efficiency compared with state-of-the-art NODE models. We prove that the C-NODE framework extends the classical NODE on classification tasks by demonstrating explicit C-NODE representable functions that are not expressible by NODEs. Additionally, we present C-NODE-based continuous normalizing flows, which describe the density evolution of latent variables along multiple dimensions. Empirical results demonstrate the improvements provided by the proposed method for classification and density estimation on the CIFAR-10, SVHN, and MNIST datasets under a computational budget similar to that of existing NODE methods. The results also provide empirical evidence that the learned curves improve the efficiency of the system through fewer parameters and function evaluations than the baselines.
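To make "curves on which the PDEs reduce to ODEs" concrete, here is a minimal sketch on a toy problem that is not taken from the paper. It assumes the linear PDE u_t + C*u_x = -u with initial condition u(x, 0) = sin(x), whose characteristic system is dt/ds = 1, dx/ds = C, du/ds = -u. The constant `C`, the function names, and the hand-written RK4 integrator are all illustrative choices. In C-NODE the curves are learned; here they are fixed by the PDE.

```python
import math

C = 2.0  # transport speed (hypothetical value chosen for this demo)

def characteristic_rhs(state):
    """Characteristic ODEs for u_t + C*u_x = -u: returns (dt/ds, dx/ds, du/ds)."""
    _t, _x, u = state
    return (1.0, C, -u)

def rk4_step(f, state, h):
    """One classical fourth-order Runge-Kutta step for a tuple-valued state."""
    def shifted(k, scale):
        return tuple(s + scale * ki for s, ki in zip(state, k))
    k1 = f(state)
    k2 = f(shifted(k1, 0.5 * h))
    k3 = f(shifted(k2, 0.5 * h))
    k4 = f(shifted(k3, h))
    return tuple(
        s + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def solve_along_characteristic(x0, s_end, n_steps=100):
    """Integrate (t, x, u) along the characteristic that starts at (0, x0)."""
    state = (0.0, x0, math.sin(x0))  # initial condition u(x, 0) = sin(x)
    h = s_end / n_steps
    for _ in range(n_steps):
        state = rk4_step(characteristic_rhs, state, h)
    return state

def exact_solution(x, t):
    """Closed-form solution of the toy PDE, used only as a check."""
    return math.sin(x - C * t) * math.exp(-t)

t_end, x_end, u_end = solve_along_characteristic(x0=0.5, s_end=1.0)
error = abs(u_end - exact_solution(x_end, t_end))
print(f"t={t_end:.3f}, x={x_end:.3f}, |u - exact|={error:.2e}")
```

The PDE is never discretised on a grid: only an ODE solver is called, which is why ODE machinery such as the adjoint method remains applicable in this setting.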

