Neural Collapse with Cross-Entropy Loss
This work provides a theoretical justification for the neural collapse phenomenon observed in deep learning, making it relevant to researchers who study the dynamics of neural networks.
This paper investigates the cross-entropy loss for $n$ feature vectors on a unit hypersphere in $\mathbb{R}^d$. It proves that when $d \geq n - 1$, the global minimum is attained by a simplex equiangular tight frame, which accounts for neural collapse. It also shows that as $n \rightarrow \infty$ with $d$ fixed, the minimizing points become uniformly distributed on the hypersphere.
We consider the variational problem of cross-entropy loss with $n$ feature vectors on a unit hypersphere in $\mathbb{R}^d$. We prove that when $d \geq n - 1$, the global minimum is given by the simplex equiangular tight frame, which justifies the neural collapse behavior. We also prove that as $n \rightarrow \infty$ with fixed $d$, the minimizing points become uniformly distributed on the hypersphere, and we show a connection with the frame potential of Benedetto & Fickus.
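The simplex equiangular tight frame claim can be checked numerically with a short sketch. The abstract does not state the exact form of the loss, so the code below makes an assumption: the classifier weights coincide with the unit feature vectors, giving $\sum_i -\log\big(e^{\langle x_i, x_i\rangle} / \sum_j e^{\langle x_i, x_j\rangle}\big)$. The function names `simplex_etf`, `cross_entropy_loss`, and `random_unit_vectors` are illustrative and not taken from the paper.

```python
import numpy as np


def simplex_etf(n):
    """Return n unit vectors (rows) with pairwise inner product -1/(n-1).

    The rows live in R^n but are orthogonal to the all-ones vector,
    so they span an (n-1)-dimensional subspace, matching d >= n - 1.
    """
    centered = np.eye(n) - np.ones((n, n)) / n
    return np.sqrt(n / (n - 1)) * centered


def cross_entropy_loss(X):
    """Assumed loss: sum_i -log(exp(<x_i,x_i>) / sum_j exp(<x_i,x_j>)).

    Assumption (not from the source): classifier weights equal the features.
    """
    G = X @ X.T  # Gram matrix of inner products
    row_max = G.max(axis=1, keepdims=True)
    # Numerically stable log-sum-exp over each row of G
    log_sum_exp = row_max[:, 0] + np.log(np.exp(G - row_max).sum(axis=1))
    return float(np.sum(log_sum_exp - np.diag(G)))


def random_unit_vectors(n, d, rng):
    """Draw n points uniformly at random on the unit sphere in R^d."""
    X = rng.standard_normal((n, d))
    return X / np.linalg.norm(X, axis=1, keepdims=True)


n = 5
etf = simplex_etf(n)
gram = etf @ etf.T

rng = np.random.default_rng(0)
etf_loss = cross_entropy_loss(etf)
random_losses = [
    cross_entropy_loss(random_unit_vectors(n, n, rng)) for _ in range(100)
]

print("simplex ETF loss:", etf_loss)
print("best of 100 random configurations:", min(random_losses))
```

Under this assumed loss, the simplex configuration should never lose to a random configuration on the sphere, which is consistent with the global-minimum statement for $d \geq n - 1$. A comparison against random draws is only a sanity check, not a proof.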