What shapes the loss landscape of self-supervised learning?
This work offers theoretical insight into a fundamental issue in self-supervised learning. The contribution is incremental in that it builds on existing design principles to deepen understanding.
The authors tackled the question of when and why dimensional collapse occurs in self-supervised learning (SSL). They developed an analytically tractable theory of SSL loss landscapes, used it to identify the causes of collapse and to study the effects of normalization and bias, and then applied it to explain how collapse can be beneficial and what affects the robustness of SSL against data imbalance.
Prevention of complete and dimensional collapse of representations has recently become a design principle for self-supervised learning (SSL). However, questions remain in our theoretical understanding: When do those collapses occur? What are the mechanisms and causes? We answer these questions by deriving and thoroughly analyzing an analytically tractable theory of SSL loss landscapes. In this theory, we identify the causes of dimensional collapse and study the effect of normalization and bias. Finally, we leverage the interpretability afforded by the analytical theory to understand how dimensional collapse can be beneficial and what affects the robustness of SSL against data imbalance.
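To make the two kinds of collapse concrete, the sketch below shows one common way to diagnose them: count the directions in which a batch of embeddings has non-negligible variance. It is an illustration only, not the paper's theory or method. The helper names (`covariance_spectrum`, `numerical_rank`), the tolerances, and the synthetic data are all assumptions made for this example. It relies on the usual definitions: complete collapse means every input maps to the same vector, and dimensional collapse means the embeddings occupy only a lower-dimensional subspace of the embedding space.

```python
import numpy as np


def covariance_spectrum(embeddings):
    """Eigenvalues of the embedding covariance matrix, largest first."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / (len(embeddings) - 1)
    # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(cov)[::-1]


def numerical_rank(embeddings, rel_tol=1e-6, abs_tol=1e-12):
    """Count directions with non-negligible variance.

    Both tolerances are illustrative assumptions: an eigenvalue counts only
    if it exceeds abs_tol and also rel_tol times the largest eigenvalue.
    """
    spectrum = covariance_spectrum(embeddings)
    threshold = max(abs_tol, rel_tol * spectrum[0])
    return int(np.sum(spectrum > threshold))


# Synthetic embeddings (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
n, d = 512, 8

# No collapse: variance in all d directions.
healthy = rng.normal(size=(n, d))

# Dimensional collapse: embeddings confined to a 3-dimensional subspace.
basis = rng.normal(size=(3, d))
dim_collapsed = rng.normal(size=(n, 3)) @ basis

# Complete collapse: every input maps to the same vector.
complete = np.tile(rng.normal(size=(1, d)), (n, 1))

for name, emb in [("healthy", healthy),
                  ("dimensional collapse", dim_collapsed),
                  ("complete collapse", complete)]:
    print(f"{name}: numerical rank = {numerical_rank(emb)} of {d}")
```

By construction, the three arrays have variance in 8, 3, and 0 directions respectively, so the reported rank separates the healthy case from the two collapsed ones. With real encoder outputs the spectrum usually decays gradually rather than dropping to zero, so the choice of tolerance matters and would need to be set per application.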