LG ML Dec 10, 2020

On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers

arXiv:2012.05420v3 18.954 citations

Originality: Incremental advance

AI Analysis

This work provides an analytical understanding of the internal representations learned by deep neural networks, which is important for researchers studying neural network interpretability and design.

This paper explains analytically, in toy models of highly expressive deep neural networks, the observed phenomenon that all data points within a class are mapped to a single point in the penultimate layer and that these class-specific points form the vertices of a regular simplex. In complementary examples, it also demonstrates rigorously that even the final output is not uniform over the samples of a class when the classifier is a shallow network or when the deeper layers fail to bring the data into a convenient geometric configuration.

A recent numerical study observed that neural network classifiers enjoy a large degree of symmetry in the penultimate layer. Namely, if $h(x) = Af(x) + b$ where $A$ is a linear map and $f$ is the output of the penultimate layer of the network (after activation), then all data points $x_{i, 1}, \dots, x_{i, N_i}$ in a class $C_i$ are mapped to a single point $y_i$ by $f$ and the points $y_i$ are located at the vertices of a regular $(k-1)$-dimensional standard simplex in a high-dimensional Euclidean space. We explain this observation analytically in toy models for highly expressive deep neural networks. In complementary examples, we demonstrate rigorously that even the final output of the classifier $h$ is not uniform over data samples from a class $C_i$ if $h$ is a shallow network (or if the deeper layers do not bring the data samples into a convenient geometric configuration).
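The simplex geometry described in the abstract can be checked numerically. The following is a minimal sketch, not the paper's own code: it assumes the class points $y_i$ are the $k$ one-hot vectors in $\mathbb{R}^k$ (one convenient realization of a regular simplex), and the function names and the choice $k = 4$ are purely illustrative. It verifies that all vertices are equidistant and that, after centering, any two of them have cosine $-1/(k-1)$.

```python
import numpy as np

def simplex_vertices(k):
    """Assumed realization: the k one-hot vectors in R^k."""
    return np.eye(k)

def pairwise_distances(points):
    """Euclidean distance between every pair of rows."""
    diffs = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diffs, axis=-1)

def centered_cosines(points):
    """Cosine of the angle between rows after subtracting their mean."""
    centered = points - points.mean(axis=0)
    unit = centered / np.linalg.norm(centered, axis=1, keepdims=True)
    return unit @ unit.T

k = 4                               # illustrative number of classes
Y = simplex_vertices(k)             # hypothetical class points y_1, ..., y_k
D = pairwise_distances(Y)
C = centered_cosines(Y)
off_diag = ~np.eye(k, dtype=bool)   # mask selecting pairs with i != j

# Regular simplex: every pair of distinct vertices is equally far apart.
print(np.allclose(D[off_diag], np.sqrt(2.0)))
# Centered vertices are maximally spread: cosine equals -1/(k-1).
print(np.allclose(C[off_diag], -1.0 / (k - 1)))
```

Any rotation, translation, or uniform rescaling of these points is again a regular simplex, so the one-hot coordinates are only one of many equivalent choices.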

