LG | CV | ML | Aug 18, 2020

Prevalence of Neural Collapse during the terminal phase of deep learning training

arXiv:2008.08186v2 | 892 citations
AI Analysis

This paper reports a pervasive geometric structure that emerges late in deep network training, which the authors associate with better generalization, robustness, and interpretability.

The paper identifies a pervasive inductive bias, Neural Collapse, that emerges during the terminal phase of deep learning training: the last-layer class-means and classifiers converge to a Simplex Equiangular Tight Frame. The authors measure this across three architectures and seven datasets and argue that the resulting geometry confers better generalization, robustness, and interpretability.

Modern practice for training classification deepnets involves a Terminal Phase of Training (TPT), which begins at the epoch where training error first vanishes; during TPT, the training error stays effectively zero while training loss is pushed towards zero. Direct measurements of TPT, for three prototypical deepnet architectures and across seven canonical classification datasets, expose a pervasive inductive bias we call Neural Collapse, involving four deeply interconnected phenomena: (NC1) Cross-example within-class variability of last-layer training activations collapses to zero, as the individual activations themselves collapse to their class-means; (NC2) The class-means collapse to the vertices of a Simplex Equiangular Tight Frame (ETF); (NC3) Up to rescaling, the last-layer classifiers collapse to the class-means, or in other words to the Simplex ETF, i.e. to a self-dual configuration; (NC4) For a given activation, the classifier's decision collapses to simply choosing whichever class has the closest train class-mean, i.e. the Nearest Class Center (NCC) decision rule. The symmetric and very simple geometry induced by the TPT confers important benefits, including better generalization performance, better robustness, and better interpretability.
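The four phenomena in the abstract can be made concrete with a small NumPy sketch. It is illustrative only and is not the paper's measurement code: the function names `simplex_etf` and `nc_metrics` are hypothetical, the NC1 quantity is a simplified trace ratio rather than the statistic used in the paper, and the rows of `weights` are assumed to follow the sorted order of the class labels.

```python
import numpy as np

def simplex_etf(num_classes, dim, seed=0):
    """Columns of the returned (dim, num_classes) matrix form a Simplex ETF.

    Assumes dim >= num_classes so that a random matrix with orthonormal
    columns can be drawn via a reduced QR factorisation.
    """
    if dim < num_classes:
        raise ValueError("this sketch requires dim >= num_classes")
    rng = np.random.default_rng(seed)
    # Q has shape (dim, num_classes) with orthonormal columns.
    q, _ = np.linalg.qr(rng.standard_normal((dim, num_classes)))
    c = num_classes
    centering = np.eye(c) - np.ones((c, c)) / c
    return np.sqrt(c / (c - 1)) * q @ centering

def nc_metrics(features, labels, weights, bias):
    """Simplified proxies for NC1-NC4 (all tend to 0 under exact collapse).

    features: (n, dim) last-layer activations; labels: (n,) class ids;
    weights: (C, dim) and bias: (C,) of the last-layer linear classifier.
    """
    classes = np.unique(labels)
    c = len(classes)
    means = np.stack([features[labels == k].mean(axis=0) for k in classes])
    centered = means - features.mean(axis=0)  # class-means minus global mean

    # NC1 proxy: within-class scatter relative to between-class scatter.
    within = features - means[np.searchsorted(classes, labels)]
    nc1 = ((within ** 2).sum() / len(features)) / ((centered ** 2).sum() / c)

    # NC2: pairwise cosines of centered class-means should equal -1/(C-1).
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    off_diag = (unit @ unit.T)[~np.eye(c, dtype=bool)]
    nc2 = np.abs(off_diag + 1.0 / (c - 1)).max()

    # NC3: classifier and centered class-means agree up to rescaling.
    nc3 = np.linalg.norm(weights / np.linalg.norm(weights)
                         - centered / np.linalg.norm(centered))

    # NC4: fraction of examples where the linear decision differs from NCC.
    linear_pred = classes[np.argmax(features @ weights.T + bias, axis=1)]
    dists = np.linalg.norm(features[:, None, :] - means[None, :, :], axis=2)
    ncc_pred = classes[np.argmin(dists, axis=1)]
    nc4 = np.mean(linear_pred != ncc_pred)

    return {"nc1": nc1, "nc2": nc2, "nc3": nc3, "nc4": nc4}
```

On activations from a real network, these values would be tracked epoch by epoch; under Neural Collapse one would expect each of them to decrease towards zero during TPT.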

Code Implementations: 1 repo