On the Role of Neural Collapse in Transfer Learning
This work helps researchers understand the mechanisms behind transfer learning, but the contribution is incremental because it builds on existing observations of neural collapse.
The paper explains why foundation models' representations are effective for few-shot learning by linking this effectiveness to neural collapse, and it shows both theoretically and empirically that the collapse property generalizes to new classes.
We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes. Recent results in the literature show that representations learned by a single classifier over many classes are competitive on few-shot learning problems with representations learned by special-purpose algorithms designed for such problems. In this paper, we provide an explanation for this behavior based on the recently observed phenomenon that the features learned by overparameterized classification networks exhibit a clustering property called neural collapse. We demonstrate both theoretically and empirically that neural collapse generalizes to new samples from the training classes, and -- more importantly -- to new classes as well, allowing foundation models to provide feature maps that work well in transfer learning and, specifically, in the few-shot setting.
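The clustering property described above can be made concrete with a small numerical sketch. The abstract does not specify how collapse is measured, so the ratio below (within-class variance divided by the squared distance between class means, averaged over class pairs) is an assumption chosen for illustration, as are all function names and the synthetic two-dimensional "features". The nearest-class-mean classifier is likewise a hypothetical stand-in for a few-shot classifier built on a fixed feature map; it is not taken from the source.

```python
import random
from itertools import combinations


def class_mean(features):
    """Coordinate-wise mean of a list of feature vectors."""
    dim = len(features[0])
    return [sum(f[d] for f in features) / len(features) for d in range(dim)]


def class_variance(features, mean):
    """Average squared Euclidean distance of the features to their class mean."""
    total = sum(sum((x - m) ** 2 for x, m in zip(f, mean)) for f in features)
    return total / len(features)


def variance_to_distance_ratio(features_a, features_b):
    """Assumed collapse measure for one pair of classes.

    Small values mean the two classes form tight, well-separated clusters.
    """
    mu_a, mu_b = class_mean(features_a), class_mean(features_b)
    var_a = class_variance(features_a, mu_a)
    var_b = class_variance(features_b, mu_b)
    dist_sq = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    return (var_a + var_b) / (2 * dist_sq)


def average_ratio(features_by_class):
    """Average the pairwise ratio over all distinct pairs of classes."""
    pairs = list(combinations(features_by_class.values(), 2))
    return sum(variance_to_distance_ratio(a, b) for a, b in pairs) / len(pairs)


def nearest_class_mean(query, support_by_class):
    """Label a query by the closest class mean computed from a few support samples."""
    means = {label: class_mean(feats) for label, feats in support_by_class.items()}
    return min(
        means,
        key=lambda label: sum((q - m) ** 2 for q, m in zip(query, means[label])),
    )


def synthetic_class(center, spread, n, rng):
    """Hypothetical features: Gaussian noise of the given spread around a center."""
    return [[c + rng.gauss(0, spread) for c in center] for _ in range(n)]


rng = random.Random(0)
centers = {"a": [0.0, 0.0], "b": [5.0, 0.0], "c": [0.0, 5.0]}
# "Collapsed" features cluster tightly around their class means; "loose" ones do not.
tight = {k: synthetic_class(c, 0.1, 50, rng) for k, c in centers.items()}
loose = {k: synthetic_class(c, 2.0, 50, rng) for k, c in centers.items()}

# Five support samples per class, mimicking a few-shot setting.
five_shot_support = {label: feats[:5] for label, feats in tight.items()}
predicted = nearest_class_mean([4.8, 0.2], five_shot_support)
```

In this toy setting, `average_ratio(tight)` comes out much smaller than `average_ratio(loose)`, and a handful of support samples per class is enough for the nearest-mean rule to label a nearby query correctly. This mirrors, in a highly simplified way, why collapsed features on new classes would help in the few-shot setting.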