LG | OC | Jan 31, 2022

Imbedding Deep Neural Networks

arXiv:2202.00113v2
Originality: Highly original
AI Analysis

This work addresses the theoretical and practical challenges of network depth in deep learning, offering a novel approach for researchers and practitioners, though it appears incremental in building upon Neural ODEs.

The paper tackles the problem of understanding and optimizing continuous-depth neural networks. It introduces a new method based on Invariant Imbedding, which reduces the optimization to a system of forward-facing initial value problems, and reports competitive performance in supervised learning and time series prediction.

Continuous-depth neural networks, such as Neural ODEs, have refashioned the understanding of residual neural networks in terms of non-linear vector-valued optimal control problems. The common solution is to use the adjoint sensitivity method to replicate a forward-backward pass optimisation problem. We propose a new approach which explicates the network's 'depth' as a fundamental variable, thus reducing the problem to a system of forward-facing initial value problems. This new method is based on the principle of 'Invariant Imbedding' for which we prove a general solution, applicable to all non-linear, vector-valued optimal control problems with both running and terminal loss. Our new architectures provide a tangible tool for inspecting the theoretical (and to a great extent unexplained) properties of network depth. They also constitute a resource of discrete implementations of Neural ODEs comparable to classes of imbedded residual neural networks. Through a series of experiments, we show the competitive performance of the proposed architectures for supervised learning and time series prediction.
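The correspondence between residual networks and continuous-depth models mentioned in the abstract can be illustrated with a small sketch. It is not the paper's Invariant Imbedding method. It only shows the standard observation that a residual update x <- x + h * f(x) is a forward Euler step of dx/dt = f(x), so that the integration horizon plays the role of network 'depth'. The names `residual_forward` and `linear_field`, and the linear test field, are hypothetical choices made for illustration.

```python
import math


def residual_forward(x0, f, depth, n_layers):
    """Integrate dx/dt = f(x) from t = 0 to t = depth with forward Euler.

    Each Euler step has the form of a residual block: x <- x + h * f(x),
    where h = depth / n_layers is the step size.
    """
    h = depth / n_layers
    x = list(x0)
    for _ in range(n_layers):
        dx = f(x)
        x = [xi + h * di for xi, di in zip(x, dx)]
    return x


def linear_field(a):
    """Hypothetical vector field f(x) = a * x, chosen because the exact
    solution x(t) = exp(a * t) * x(0) is known in closed form."""
    return lambda x: [a * xi for xi in x]


if __name__ == "__main__":
    a, depth = -0.5, 2.0
    exact = math.exp(a * depth)
    # More layers over the same depth means a finer discretisation.
    for n_layers in (4, 64, 1000):
        approx = residual_forward([1.0], linear_field(a), depth, n_layers)[0]
        print(n_layers, approx, abs(approx - exact))
```

In this toy setting, increasing `n_layers` at fixed `depth` refines the discretisation, while changing `depth` changes the horizon of the underlying ODE. The abstract's proposal treats that second quantity as a variable in its own right.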

Code Implementations: 1 repo
