LG | OC | ML | Oct 1, 2019

Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory

arXiv:1910.00359v3 | 36 citations
Originality: Highly original
AI Analysis

The paper challenges widely held beliefs in deep learning theory, questioning foundational assumptions that both practitioners and theorists rely on.

This paper empirically investigates common assumptions in deep learning theory, proving the existence of suboptimal local minima in neural network loss landscapes and showing that small-norm parameters are not optimal for generalization.

We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike. In this work, we: (1) prove the widespread existence of suboptimal local minima in the loss landscape of neural networks, and we use our theory to find examples; (2) show that small-norm parameters are not optimal for generalization; (3) demonstrate that ResNets do not conform to wide-network theories, such as the neural tangent kernel, and that the interaction between skip connections and batch normalization plays a role; (4) find that rank does not correlate with generalization or robustness in a practical setting.
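Two of the quantities named in the abstract, parameter norm and weight-matrix rank, can be measured directly from a network's weights. The following is a minimal sketch in NumPy, not the authors' code: the helper names `parameter_norm` and `numerical_rank`, the relative tolerance `rel_tol`, and the randomly generated weights are all assumptions made for illustration. A real study would load trained weights and compare these values against test accuracy or robustness.

```python
import numpy as np


def parameter_norm(weights):
    """L2 norm of all parameters, treating the arrays as one flat vector."""
    return float(np.sqrt(sum(np.sum(w ** 2) for w in weights)))


def numerical_rank(w, rel_tol=1e-3):
    """Count singular values larger than rel_tol times the largest one.

    rel_tol is an illustrative threshold, not a value taken from the paper.
    """
    s = np.linalg.svd(w, compute_uv=False)  # singular values, descending
    return int(np.sum(s > rel_tol * s[0]))


rng = np.random.default_rng(0)

# Hypothetical weights standing in for a trained two-layer network.
w1 = rng.normal(size=(64, 32))                             # generic dense layer
w2 = rng.normal(size=(16, 4)) @ rng.normal(size=(4, 64))   # rank 4 by construction

print("parameter norm:", parameter_norm([w1, w2]))
print("rank of w1:", numerical_rank(w1))
print("rank of w2:", numerical_rank(w2))
```

Collecting these two numbers for several trained models alongside their test error is one simple way to check whether lower norm or lower rank goes along with better generalization in a given setting.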

Code Implementations: 1 repo