LG OC ML

Oct 1, 2019

Truth or Backpropaganda? An Empirical Investigation of Deep Learning Theory

Micah Goldblum, Jonas Geiping, Avi Schwarzschild, Michael Moeller, Tom Goldstein

arXiv:1910.00359v316.536 citations

Has Code

Originality: Highly original

AI Analysis

The paper challenges widely held beliefs in deep learning theory, questioning foundational assumptions that both practitioners and theorists rely on.

This paper empirically investigates common assumptions in deep learning theory, proving the existence of suboptimal local minima in neural network loss landscapes and showing that small-norm parameters are not optimal for generalization.

We empirically evaluate common assumptions about neural networks that are widely held by practitioners and theorists alike. In this work, we: (1) prove the widespread existence of suboptimal local minima in the loss landscape of neural networks, and we use our theory to find examples; (2) show that small-norm parameters are not optimal for generalization; (3) demonstrate that ResNets do not conform to wide-network theories, such as the neural tangent kernel, and that the interaction between skip connections and batch normalization plays a role; (4) find that rank does not correlate with generalization or robustness in a practical setting.
