LG, DIS-NN, ML | Oct 22, 2018

A jamming transition from under- to over-parametrization affects loss landscape and generalization

arXiv:1810.09665v5 | 160 citations
Originality: Incremental advance
AI Analysis

This work helps machine learning researchers understand generalization and overfitting in neural networks by giving theoretical insight into the phase transition between the under- and over-parametrized regimes.

The paper investigates a phase transition between under- and over-parametrized regimes in fully-connected networks, showing that it affects loss minimization and generalization error, with the error displaying a cusp at the transition point and slow decay afterward.

We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. Under some general conditions, we show that this transition is sharp for the hinge loss. In the whole over-parametrized regime, poor minima of the loss are not encountered during training since the number of constraints to satisfy is too small to hamper minimization. Our findings support a link between this transition and the generalization properties of the network: as we increase the number of parameters of a given model, starting from an under-parametrized network, we observe that the generalization error displays three phases: (i) initial decay, (ii) increase until the transition point --- where it displays a cusp --- and (iii) slow decay toward a constant for the rest of the over-parametrized regime. Thereby we identify the region where the classical phenomenon of over-fitting takes place, and the region where the model keeps improving, in line with previous empirical observations for modern neural networks.
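The abstract's picture of training as satisfying a set of constraints can be made concrete with a minimal sketch. It assumes the standard linear hinge loss with unit margin, which may differ in detail from the form used in the paper; the function name `hinge_constraints` and the sample values are hypothetical. Each training example counts as one constraint, satisfied when the label times the network output reaches the margin, and the loss vanishes exactly when no constraint is violated.

```python
def hinge_constraints(outputs, labels, margin=1.0):
    """Return (mean hinge loss, number of unsatisfied constraints).

    Assumption: standard linear hinge loss max(0, margin - y * f),
    not necessarily the exact variant used in the paper.
    """
    # A positive gap means the example violates its margin constraint.
    gaps = [margin - y * f for f, y in zip(outputs, labels)]
    loss = sum(max(g, 0.0) for g in gaps) / len(gaps)
    n_unsatisfied = sum(1 for g in gaps if g > 0)
    return loss, n_unsatisfied


# Hypothetical network outputs and +/-1 labels for four training examples.
outputs = [2.0, 0.5, -1.5, -0.2]
labels = [1, 1, -1, 1]
loss, n_unsatisfied = hinge_constraints(outputs, labels)
print(loss, n_unsatisfied)
```

In this reading, the over-parametrized regime is the one where training can drive the count of unsatisfied constraints, and with it the loss, to zero.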
