ML, LG | Sep 9, 2019

Differential equations as models of deep neural networks

arXiv:1909.03767v2 | 3 citations
Originality: Incremental advance
AI Analysis

The paper provides a theoretical framework for understanding neural networks with the tools of classical mechanics. The advance is incremental: it connects existing network types to differential equations.

The paper analyzes differential equations as models for deep neural networks, showing that the loss gradient can be treated as generalized momentum and relating feedforward networks with small nonlinearities to differential equations.

In this work we systematically analyze general properties of differential equations used as machine learning models. We demonstrate that the gradient of the loss function with respect to the hidden state can be considered as a generalized momentum conjugate to the hidden state, allowing application of the tools of classical mechanics. In addition, we show that not only residual networks, but also feedforward neural networks with small nonlinearities and weight matrices deviating only slightly from identity matrices can be related to differential equations. We propose a differential equation describing such networks and investigate its properties.
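
The link between near-identity feedforward layers and a differential equation can be illustrated numerically. The sketch below is not the equation proposed in the paper. It assumes, for illustration only, layers of the form h <- sigma((I + eps*A) h) with a hypothetical weak nonlinearity sigma(x) = x + eps*tanh(x), and compares a stack of T/eps such layers with the solution of dh/dt = A h + tanh(h) at time T. If the correspondence holds, the gap should shrink as eps decreases.

```python
import numpy as np

def feedforward(h0, A, eps, n_layers):
    """Stack of layers h <- sigma(W h) with W = I + eps*A (near identity)
    and sigma(x) = x + eps*tanh(x) (assumed small nonlinearity)."""
    W = np.eye(len(h0)) + eps * A
    h = h0.copy()
    for _ in range(n_layers):
        z = W @ h
        h = z + eps * np.tanh(z)
    return h

def ode_reference(h0, A, T, n_steps=1000):
    """Integrate the assumed limit dh/dt = A h + tanh(h) with classical RK4."""
    f = lambda h: A @ h + np.tanh(h)
    h = h0.copy()
    dt = T / n_steps
    for _ in range(n_steps):
        k1 = f(h)
        k2 = f(h + 0.5 * dt * k1)
        k3 = f(h + 0.5 * dt * k2)
        k4 = f(h + dt * k3)
        h = h + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return h

def depth_errors(eps_values=(0.1, 0.05, 0.025), T=1.0):
    """Distance between the network output and the ODE solution for each eps."""
    A = np.array([[0.0, 1.0], [-1.0, 0.0]])   # illustrative choice of A
    h0 = np.array([1.0, 0.5])                 # illustrative initial hidden state
    target = ode_reference(h0, A, T)
    return [
        float(np.linalg.norm(feedforward(h0, A, eps, round(T / eps)) - target))
        for eps in eps_values
    ]

if __name__ == "__main__":
    for eps, err in zip((0.1, 0.05, 0.025), depth_errors()):
        print(f"eps={eps}: error={err:.4f}")
```

Each layer acts as a first-order step of size eps, since sigma((I + eps*A) h) = h + eps*(A h + tanh(h)) + O(eps^2), which is why halving eps while doubling the depth is expected to roughly halve the error.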

