CV AI LG
May 11, 2021

VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning

arXiv:2105.04906v3
1266 citations
Originality: Highly original
AI Analysis

This paper addresses a fundamental issue in self-supervised learning for computer vision, the collapse of learned representations, and offers a more interpretable way to prevent it.

The paper tackles the collapse problem in self-supervised image representation learning, in which the encoder outputs constant vectors. It introduces VICReg, a method that avoids collapse explicitly through a regularization term on the variance of each embedding dimension, combined with a covariance-based decorrelation term, and achieves results on par with the state of the art on several downstream tasks.

Recent self-supervised methods for image representation learning are based on maximizing the agreement between embedding vectors from different views of the same image. A trivial solution is obtained when the encoder outputs constant vectors. This collapse problem is often avoided through implicit biases in the learning architecture that lack a clear justification or interpretation. In this paper, we introduce VICReg (Variance-Invariance-Covariance Regularization), a method that explicitly avoids the collapse problem with a simple regularization term on the variance of the embeddings along each dimension individually. VICReg combines the variance term with a decorrelation mechanism based on redundancy reduction and covariance regularization, and achieves results on par with the state of the art on several downstream tasks. In addition, we show that incorporating our new variance term into other methods helps stabilize the training and leads to performance improvements.
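
The three terms named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not give the formulas, so the exact forms used here are assumptions: a hinge on the per-dimension standard deviation with target `gamma`, a mean squared distance between paired embeddings, a sum of squared off-diagonal covariance entries, and the weights `lam`, `mu`, `nu`. All function and parameter names are hypothetical.

```python
import math


def invariance_term(za, zb):
    """Mean squared Euclidean distance between paired embeddings (assumed form)."""
    n = len(za)
    return sum(
        sum((a - b) ** 2 for a, b in zip(ra, rb)) for ra, rb in zip(za, zb)
    ) / n


def variance_term(z, gamma=1.0, eps=1e-4):
    """Hinge penalty that is positive when a dimension's std falls below gamma."""
    n, d = len(z), len(z[0])
    total = 0.0
    for j in range(d):
        col = [row[j] for row in z]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / (n - 1)
        total += max(0.0, gamma - math.sqrt(var + eps))
    return total / d


def covariance_term(z):
    """Sum of squared off-diagonal covariance entries, divided by the dimension."""
    n, d = len(z), len(z[0])
    means = [sum(row[j] for row in z) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in z]
    total = 0.0
    for i in range(d):
        for j in range(d):
            if i != j:
                c = sum(r[i] * r[j] for r in centered) / (n - 1)
                total += c * c
    return total / d


def vicreg_loss(za, zb, lam=25.0, mu=25.0, nu=1.0):
    """Weighted sum of the three terms; the weights are illustrative assumptions."""
    inv = invariance_term(za, zb)
    var = variance_term(za) + variance_term(zb)
    cov = covariance_term(za) + covariance_term(zb)
    return lam * inv + mu * var + nu * cov


# A collapsed batch: every embedding is the same constant vector.
collapsed = [[1.0, 2.0]] * 4
# A batch whose dimensions have spread and are uncorrelated.
spread = [[2.0, 0.0], [-2.0, 0.0], [0.0, 2.0], [0.0, -2.0]]

print(vicreg_loss(collapsed, collapsed))  # positive: the variance term fires
print(vicreg_loss(spread, spread))        # zero: no term is active
```

In this sketch, two identical collapsed views agree perfectly, so the invariance term alone would be satisfied by the trivial solution. Only the variance term penalizes it, which is the role the abstract assigns to that term. A practical implementation would use a batched array library rather than nested Python loops.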

Code Implementations: 6 repos