Nearest Class-Center Simplification through Intermediate Layers
This work addresses the geometry of learned features in deep networks and is aimed at researchers, but it is incremental, since it builds on established Neural Collapse theory.
The paper investigates Neural Collapse and Nearest Class-Center Mismatch in intermediate layers of deep networks, showing these phenomena occur in vision and language models, and proposes a Stochastic Variability-Simplification Loss (SVSL) that improves training metrics and generalization.
Recent advances in theoretical deep learning have identified geometric properties that emerge during training past the Interpolation Threshold, the point at which the training error reaches zero. We investigate the phenomenon termed Neural Collapse in the intermediate layers of networks, and examine the inner workings of Nearest Class-Center Mismatch inside the deep network. We further show that these processes occur in both vision and language model architectures. Lastly, we propose a Stochastic Variability-Simplification Loss (SVSL) that encourages better geometric features in intermediate layers and improves both training metrics and generalization.
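
As a rough illustration of the Nearest Class-Center (NCC) Mismatch idea, the sketch below measures how often a nearest-class-center rule applied to one layer's features disagrees with the network's own predictions. The abstract does not give the exact definition, so this follows the usual formulation from the Neural Collapse literature and should be read as an assumption; the function names (class_centers, ncc_predict, ncc_mismatch) and the toy data are hypothetical, not taken from the paper.

```python
import numpy as np

def class_centers(features, labels, num_classes):
    """Mean feature vector of each class; features has shape (n, d)."""
    return np.stack([features[labels == c].mean(axis=0)
                     for c in range(num_classes)])

def ncc_predict(features, centers):
    """Assign each sample to the class with the nearest center (Euclidean)."""
    # Squared distance from every sample to every center: shape (n, num_classes).
    diffs = features[:, None, :] - centers[None, :, :]
    dists = (diffs ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def ncc_mismatch(features, labels, network_preds, num_classes):
    """Fraction of samples where the NCC rule and the network disagree.

    Assumed definition: centers are computed from the given layer's
    features, and network_preds are the network's final predictions.
    """
    centers = class_centers(features, labels, num_classes)
    return float(np.mean(ncc_predict(features, centers) != network_preds))

# Toy features standing in for one intermediate layer's activations.
features = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
labels = np.array([0, 0, 1, 1])

agree = ncc_mismatch(features, labels, np.array([0, 0, 1, 1]), num_classes=2)
disagree = ncc_mismatch(features, labels, np.array([0, 1, 1, 1]), num_classes=2)
print(agree, disagree)
```

In practice the features would come from a chosen intermediate layer (for example, captured with a forward hook), and the mismatch could be tracked per layer over training; a value near zero would indicate that the layer's features are already separable by a simple nearest-center rule.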