LG, DIS-NN, ML
Mar 18, 2019

On-line learning dynamics of ReLU neural networks using statistical physics techniques

arXiv:1903.07378v1
11 citations
Originality: Incremental advance
AI Analysis

This work provides theoretical insights into neural network training dynamics, which is an incremental contribution to the field of machine learning theory.

The authors tackled the problem of understanding the on-line learning dynamics of two-layer ReLU neural networks by deriving exact macroscopic differential equations using statistical physics techniques. They found that ReLU networks exhibit similar behavior to sigmoidal networks in initial experiments but show distinctive characteristics in overrealizable and unrealizable scenarios.

We introduce exact macroscopic on-line learning dynamics of two-layer neural networks with ReLU units in the form of a system of differential equations, using techniques borrowed from statistical physics. In the first experiments, numerical solutions reveal behavior similar to that of the sigmoidal activation studied in earlier work. In these experiments the theoretical results show good correspondence with simulations. In overrealizable and unrealizable learning scenarios, the learning behavior of ReLU networks shows distinctive characteristics compared to sigmoidal networks.
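To make the kind of simulation mentioned in the abstract concrete, here is a minimal sketch of on-line gradient descent for a two-layer ReLU network. Everything specific in it is an assumption, not taken from the paper: a student-teacher setup with hidden-to-output weights fixed to 1, i.i.d. standard normal inputs, unit-length teacher vectors, and the overlaps `R` (student-teacher) and `Q` (student-student) as the macroscopic quantities. The names `simulate`, `estimate_error`, `eta` and `alpha_max` are hypothetical. It does not solve the paper's differential equations; it only runs the stochastic process that such equations would describe in the large-`N` limit.

```python
import math
import random


def relu(x):
    return x if x > 0.0 else 0.0


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def committee_output(weights, xi):
    """Sum of ReLU hidden units; hidden-to-output weights fixed to 1 (assumption)."""
    return sum(relu(dot(w, xi)) for w in weights)


def overlaps(A, B):
    """Matrix of dot products between the rows of A and the rows of B."""
    return [[dot(a, b) for b in B] for a in A]


def estimate_error(J, B, rng, n_samples):
    """Monte Carlo estimate of the generalization error 0.5 * <(student - teacher)^2>."""
    N = len(J[0])
    total = 0.0
    for _ in range(n_samples):
        xi = [rng.gauss(0.0, 1.0) for _ in range(N)]
        diff = committee_output(J, xi) - committee_output(B, xi)
        total += 0.5 * diff * diff
    return total / n_samples


def simulate(N=50, K=2, M=2, eta=0.3, alpha_max=30, seed=0, n_test=200):
    """On-line SGD for a student with K ReLU units learning a teacher with M units.

    Returns one snapshot per unit of rescaled time alpha = (examples seen) / N.
    """
    rng = random.Random(seed)

    # Teacher vectors: random directions normalized to unit length (assumption).
    B = []
    for _ in range(M):
        v = [rng.gauss(0.0, 1.0) for _ in range(N)]
        norm = math.sqrt(dot(v, v))
        B.append([x / norm for x in v])

    # Student vectors: small random initial weights (assumption).
    J = [[rng.gauss(0.0, 0.1 / math.sqrt(N)) for _ in range(N)] for _ in range(K)]

    history = []
    for step in range(alpha_max * N + 1):
        if step % N == 0:
            history.append({
                "alpha": step // N,
                "R": overlaps(J, B),  # student-teacher overlaps
                "Q": overlaps(J, J),  # student-student overlaps
                "eps": estimate_error(J, B, rng, n_test),
            })
        # Each example is drawn fresh and used exactly once (on-line learning).
        xi = [rng.gauss(0.0, 1.0) for _ in range(N)]
        x = [dot(w, xi) for w in J]
        delta = committee_output(B, xi) - sum(relu(v) for v in x)
        for i in range(K):
            if x[i] > 0.0:  # ReLU derivative is 1 for active units, 0 otherwise
                scale = eta / N * delta
                J[i] = [w + scale * z for w, z in zip(J[i], xi)]
    return history


if __name__ == "__main__":
    for snap in simulate()[::10]:
        print(snap["alpha"], round(snap["eps"], 4))
```

Under these assumptions, calling `simulate(K=3, M=2)` gives an overrealizable setting (more student than teacher units) and `simulate(K=2, M=3)` an unrealizable one, which are the two regimes the abstract singles out. Pure Python is used to keep the sketch dependency-free; a vectorized version would be preferable for larger `N`.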

