LG, ML | Oct 11, 2021

Towards Demystifying Representation Learning with Non-contrastive Self-supervision

arXiv:2110.04947v2 | 17.233 citations
Originality: Incremental advance
AI Analysis

This work provides theoretical insights into non-contrastive self-supervised learning, which is important for researchers in representation learning, but it is incremental as it builds on prior work like Tian et al. 2021.

The paper tackles the lack of theoretical understanding of non-contrastive self-supervised learning methods such as BYOL and SimSiam. It proves that, in a linear network, these methods learn a desirable projection matrix and reduce sample complexity on downstream tasks. Experiments show that the authors' new algorithm, DirectCopy, rivals or outperforms DirectPred on STL-10, CIFAR-10, CIFAR-100, and ImageNet.

Non-contrastive methods of self-supervised learning (such as BYOL and SimSiam) learn representations by minimizing the distance between two views of the same image. These approaches have achieved remarkable performance in practice, but the theoretical understanding lags behind. Tian et al. 2021 explained why the representation does not collapse to zero; however, how the feature is learned remains mysterious. In our work, we prove that in a linear network, non-contrastive methods learn a desirable projection matrix and also reduce the sample complexity on downstream tasks. Our analysis suggests that weight decay acts as an implicit threshold that discards the features with high variance under data augmentations and keeps the features with low variance. Inspired by our theory, we design a simpler and more computationally efficient algorithm, DirectCopy, by removing the eigen-decomposition step in the original DirectPred algorithm of Tian et al. 2021. Our experiments show that DirectCopy rivals or even outperforms DirectPred on STL-10, CIFAR-10, CIFAR-100, and ImageNet.
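
To make the difference between the two predictor updates concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the function names, the momentum and `eps` values, the use of the Frobenius norm for normalization, and the toy Gaussian features are all hypothetical choices made for this example. The only point it tries to convey is the one in the abstract: a DirectPred-style update eigen-decomposes a running correlation matrix of the features, while a DirectCopy-style update reuses that matrix directly and skips the decomposition.

```python
import numpy as np


def update_correlation(F, features, momentum=0.5):
    """Exponential moving average of the feature correlation matrix.

    `features` has shape (batch, dim). The momentum value is an assumption.
    """
    batch_corr = features.T @ features / features.shape[0]
    return momentum * F + (1.0 - momentum) * batch_corr


def direct_pred_style_predictor(F, eps=0.1):
    """DirectPred-style update (sketch): eigen-decompose F, then rescale
    each eigenvalue before rebuilding the predictor matrix."""
    s, U = np.linalg.eigh(F)            # the eigen-decomposition step
    s = np.clip(s, 0.0, None)           # guard against tiny negative values
    p = np.sqrt(s / max(s.max(), 1e-12)) + eps
    return (U * p) @ U.T                # U @ diag(p) @ U.T


def direct_copy_style_predictor(F, eps=0.1):
    """DirectCopy-style update (sketch): no eigen-decomposition.

    The predictor is a normalized copy of F plus a small multiple of the
    identity. The Frobenius norm used here is an assumption of this sketch.
    """
    norm = max(np.linalg.norm(F), 1e-12)
    return F / norm + eps * np.eye(F.shape[0])


# Toy usage: random features whose variance differs per dimension.
rng = np.random.default_rng(0)
dim = 8
F = np.zeros((dim, dim))
for _ in range(10):
    feats = rng.normal(size=(64, dim)) * np.linspace(0.1, 2.0, dim)
    F = update_correlation(F, feats)

W_pred = direct_pred_style_predictor(F)
W_copy = direct_copy_style_predictor(F)
```

Both functions return a symmetric matrix of the same shape as `F`. The DirectCopy-style version costs one norm and one addition per update, which is where the computational saving mentioned in the abstract would come from in a sketch like this.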

Code Implementations: 2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
