Neural collapse in the orthoplex regime
This work studies neural collapse in classification settings, such as language models, where the number of classes exceeds the feature dimension, and gives a theoretical characterization of the geometries that emerge.
The paper characterizes the geometric figures that emerge during neural collapse in the orthoplex regime, where the number of classes n exceeds the feature dimension d, specifically d+2 ≤ n ≤ 2d. The analysis relies primarily on Radon's theorem and convexity.
When training a neural network for classification, the feature vectors of the training set are known to collapse to the vertices of a regular simplex, provided the dimension $d$ of the feature space and the number $n$ of classes satisfy $n\leq d+1$. This phenomenon is known as neural collapse. In other applications, such as language models, one instead takes $n\gg d$. Here, neural collapse still occurs, but with different emergent geometric figures. We characterize these geometric figures in the orthoplex regime where $d+2\leq n\leq 2d$. The techniques in our analysis primarily involve Radon's theorem and convexity.
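As a rough numerical illustration of the two geometries named above, the following sketch builds a regular simplex and the vertex set of an orthoplex (cross-polytope) and inspects their pairwise inner products. It is only a sketch under stated assumptions: the helper names `simplex_vertices`, `orthoplex_vertices`, and `off_diagonal_gram` are hypothetical, and the full orthoplex with $n=2d$ vertices is used as a standard example suggested by the regime's name, not as a statement of the configurations the paper derives for every $n$ in $d+2\leq n\leq 2d$.

```python
import numpy as np


def simplex_vertices(n):
    """Return n unit vectors forming a regular simplex.

    The vectors are written in R^n but sum to zero, so they span an
    (n-1)-dimensional subspace, matching the condition n <= d + 1.
    """
    centered = np.eye(n) - np.full((n, n), 1.0 / n)  # subtract the centroid
    return centered / np.linalg.norm(centered, axis=1, keepdims=True)


def orthoplex_vertices(d):
    """Return the 2d orthoplex vertices +e_i and -e_i in R^d (illustrative assumption)."""
    return np.vstack([np.eye(d), -np.eye(d)])


def off_diagonal_gram(points):
    """Return all pairwise inner products between distinct points."""
    gram = points @ points.T
    mask = ~np.eye(len(points), dtype=bool)
    return gram[mask]


if __name__ == "__main__":
    n = 5
    simplex = simplex_vertices(n)
    # Every pair of distinct simplex vertices has inner product -1/(n-1).
    print(np.allclose(off_diagonal_gram(simplex), -1.0 / (n - 1)))

    d = 3
    orthoplex = orthoplex_vertices(d)
    inner = off_diagonal_gram(orthoplex)
    # Non-antipodal pairs are orthogonal (0); antipodal pairs give -1.
    print(inner.max(), inner.min())
```

In the simplex case all distinct pairs share the same negative inner product, whereas for the orthoplex vertices the largest pairwise inner product is zero, which gives a sense of why the geometry has to change once $n$ exceeds $d+1$.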