LG

Mar 4, 2023

Towards a Unified Theoretical Understanding of Non-contrastive Learning via Rank Differential Mechanism

Zhijian Zhuo, Yifei Wang, Jinwen Ma, Yisen Wang

MIT

arXiv:2303.02387v123.234 citations

h-index: 33

Has Code

Originality: Incremental advance

AI Analysis

The paper provides a foundational theoretical framework for researchers in self-supervised learning, though it is an incremental advance that builds on existing methods.

The paper tackles the lack of a unified theoretical understanding of how various non-contrastive learning methods avoid feature collapse. It proposes the Rank Differential Mechanism (RDM) theory, which explains collapse avoidance across different asymmetric designs and provides guidelines for designing new variants. These variants achieve comparable or better performance on benchmark datasets.

Recently, a variety of methods under the name of non-contrastive learning (like BYOL, SimSiam, SwAV, DINO) have shown that, when equipped with some asymmetric architectural designs, aligning positive pairs alone is sufficient to attain good performance in self-supervised visual learning. Although some specific modules (like the predictor in BYOL) are partly understood, there is as yet no unified theoretical understanding of how these seemingly different asymmetric designs can all avoid feature collapse, particularly considering methods that also work without the predictor (like DINO). In this work, we propose a unified theoretical understanding for existing variants of non-contrastive learning. Our theory, named Rank Differential Mechanism (RDM), shows that all these asymmetric designs create a consistent rank difference in their dual-branch output features. This rank difference provably leads to an improvement of effective dimensionality and alleviates either complete or dimensional feature collapse. Unlike previous theories, our RDM theory is applicable to different asymmetric designs (with and without the predictor), and thus can serve as a unified understanding of existing non-contrastive learning methods. In addition, our RDM theory provides practical guidelines for designing many new non-contrastive variants. We show that these variants indeed achieve comparable performance to existing methods on benchmark datasets, and some of them even outperform the baselines. Our code is available at https://github.com/PKU-ML/Rank-Differential-Mechanism.
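To make the notions of "effective dimensionality" and "rank difference" in the abstract concrete, here is a minimal sketch of one common measure, the entropy-based effective rank of a spectrum. Whether the paper uses exactly this definition is an assumption, and the function names `effective_rank` and `rank_difference` and the example spectra are hypothetical, not taken from the authors' code. The sketch also does not assume which branch has the higher rank, since the abstract does not say.

```python
import math


def effective_rank(spectrum):
    """Entropy-based effective rank of a non-negative spectrum.

    `spectrum` is a list of singular values (or eigenvalues) of a feature
    matrix. This definition is an assumption for illustration; the paper
    may define effective dimensionality differently.
    """
    if any(s < 0 for s in spectrum):
        raise ValueError("spectrum values must be non-negative")
    total = sum(spectrum)
    if total <= 0:
        raise ValueError("spectrum must contain at least one positive value")
    entropy = 0.0
    for s in spectrum:
        p = s / total  # normalize the spectrum into a distribution
        if p > 0:      # treat 0 * log(0) as 0
            entropy -= p * math.log(p)
    return math.exp(entropy)


def rank_difference(spectrum_a, spectrum_b):
    """Effective-rank gap between two branches' output spectra."""
    return effective_rank(spectrum_a) - effective_rank(spectrum_b)


# Hypothetical spectra, not measured from any model.
collapsed = [1.0, 0.0, 0.0, 0.0]  # all energy in one direction
uniform = [1.0, 1.0, 1.0, 1.0]    # energy spread over four directions

print(effective_rank(collapsed))
print(effective_rank(uniform))
print(rank_difference(uniform, collapsed))
```

A spectrum concentrated in a single direction gives an effective rank of 1, while a flat spectrum over four directions gives a value close to 4, so a positive gap between two branches indicates that one spreads its features over more directions than the other. In practice the spectrum would come from the branch outputs, for example the singular values of a batch of output features.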
