LG · CV · NE · Dec 12, 2020

Delay Differential Neural Networks

arXiv:2012.06800v1 · 8 citations
AI Analysis

This work provides a more parameter-efficient continuous-depth model for researchers and practitioners working with neural ordinary differential equations and ResNet architectures.

This paper introduces Delay Differential Neural Networks (DDNNs), which model the derivative of hidden feature vectors as a function of current and past feature vectors, drawing inspiration from delay differential equations. This approach leads to continuous-depth alternatives for ResNet variants and improves data efficiency over Neural Ordinary Differential Equations (NODEs) by reducing parameters without sacrificing generalization.
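As a rough illustration of that idea, a delay differential model with a single constant delay can be written as below. The symbols are assumptions introduced here, not notation taken from the paper: h(t) is the hidden feature vector at depth t, τ > 0 is the delay, f_θ is the neural network, and x is the input that defines the history. The paper's two architectures may combine current and past features differently.

\[
\frac{dh(t)}{dt} = f_\theta\bigl(h(t),\, h(t-\tau),\, t\bigr), \quad t \in (0, T],
\qquad h(t) = x \ \text{ for } t \le 0 .
\]

Setting aside the dependence on h(t-τ) recovers the NODE form dh(t)/dt = f_θ(h(t), t).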

Neural ordinary differential equations (NODEs) treat the computation of intermediate feature vectors as trajectories of an ordinary differential equation parameterized by a neural network. In this paper, we propose a novel model, delay differential neural networks (DDNN), inspired by delay differential equations (DDEs). The proposed model considers the derivative of the hidden feature vector as a function of the current feature vector and past feature vectors (history). The function is modelled as a neural network and, consequently, leads to continuous-depth alternatives to many recent ResNet variants. We propose two different DDNN architectures, depending on the way current and past feature vectors are considered. For training DDNNs, we provide a memory-efficient adjoint method for computing gradients and back-propagating through the network. DDNN improves the data efficiency of NODE by further reducing the number of parameters without affecting the generalization performance. Experiments conducted on synthetic data and on real-world image classification datasets such as CIFAR-10 and CIFAR-100 show the effectiveness of the proposed models.
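The forward pass described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses a single constant delay `tau`, a constant history equal to the input, a fixed-step explicit Euler solver, and a small untrained MLP on the concatenated current and past features. The names `make_dynamics` and `integrate_dde` are hypothetical, and the paper's adjoint-based training is not shown.

```python
import numpy as np


def make_dynamics(dim, hidden, seed=0):
    """Build a small random MLP f(h_now, h_past) -> dh/dt (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(hidden, 2 * dim))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.1, size=(dim, hidden))
    b2 = np.zeros(dim)

    def f(h_now, h_past):
        # Assumption: current and delayed features are simply concatenated.
        z = np.tanh(W1 @ np.concatenate([h_now, h_past]) + b1)
        return W2 @ z + b2

    return f


def integrate_dde(f, h0, tau=0.25, t_end=1.0, dt=0.05):
    """Explicit Euler for dh/dt = f(h(t), h(t - tau)) with h(t) = h0 for t <= 0."""
    n_steps = int(round(t_end / dt))
    lag = int(round(tau / dt))  # delay expressed in solver steps
    traj = [np.asarray(h0, dtype=float)]
    for k in range(n_steps):
        # Before the delay has elapsed, fall back to the constant history h0.
        h_past = traj[max(k - lag, 0)]
        traj.append(traj[k] + dt * f(traj[k], h_past))
    return np.stack(traj)  # shape: (n_steps + 1, dim)


if __name__ == "__main__":
    f = make_dynamics(dim=4, hidden=16)
    traj = integrate_dde(f, h0=np.ones(4))
    print(traj.shape, traj[-1])
```

The last row of the returned trajectory plays the role of the final feature vector that a classifier head would consume. Storing the whole trajectory is done here only so the delayed state can be looked up by index; it says nothing about the memory behaviour of the adjoint method the paper proposes.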
