CV, AI, LG | Oct 18, 2021

Understanding Dimensional Collapse in Contrastive Self-supervised Learning

arXiv:2110.09348v3 | 478 citations
Originality: Incremental advance
AI Analysis

This paper addresses a fundamental limitation of contrastive learning in computer vision and offers a way to improve representation quality, though the advance is incremental because it builds on existing contrastive frameworks.

The paper identifies that dimensional collapse, where embeddings span a lower-dimensional subspace, occurs in contrastive self-supervised learning, and proposes DirectCLR, a method that optimizes the representation space directly without a trainable projector, outperforming SimCLR on ImageNet.

Self-supervised visual representation learning aims to learn useful representations without relying on human annotations. Joint embedding approaches are based on maximizing the agreement between embedding vectors from different views of the same image. Various methods have been proposed to solve the collapsing problem, in which all embedding vectors collapse to a trivial constant solution. Among these methods, contrastive learning prevents collapse via negative sample pairs. It has been shown that non-contrastive methods suffer from a lesser collapse problem of a different nature: dimensional collapse, whereby the embedding vectors end up spanning a lower-dimensional subspace instead of the entire available embedding space. Here, we show that dimensional collapse also happens in contrastive learning. In this paper, we shed light on the dynamics at play in contrastive learning that lead to dimensional collapse. Inspired by our theory, we propose a novel contrastive learning method, called DirectCLR, which directly optimizes the representation space without relying on an explicit trainable projector. Experiments show that DirectCLR outperforms SimCLR with a trainable linear projector on ImageNet.
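To make the notion of dimensional collapse above concrete, here is a minimal sketch of one way to check whether a batch of embedding vectors spans the full embedding space: take the singular values of the embedding covariance matrix and count how many are non-negligible. The function names, the `rel_tol` threshold, and the synthetic data are assumptions made for illustration; this is not the paper's code.

```python
import numpy as np


def singular_value_spectrum(embeddings):
    """Singular values of the embedding covariance matrix, in descending order."""
    z = np.asarray(embeddings, dtype=np.float64)
    centered = z - z.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / z.shape[0]
    return np.linalg.svd(cov, compute_uv=False)


def effective_dimension(embeddings, rel_tol=1e-6):
    """Count singular values larger than rel_tol times the largest one.

    rel_tol is a hypothetical threshold chosen for this sketch.
    """
    s = singular_value_spectrum(embeddings)
    if s[0] == 0.0:
        return 0  # every embedding is identical: complete collapse
    return int(np.sum(s > rel_tol * s[0]))


# Synthetic data (illustrative only): n embeddings in a d-dimensional space.
rng = np.random.default_rng(0)
n, d, k = 512, 16, 4

# Embeddings that use all d dimensions.
full = rng.normal(size=(n, d))

# Embeddings confined to a k-dimensional subspace of the d-dimensional space.
basis = rng.normal(size=(k, d))
collapsed = rng.normal(size=(n, k)) @ basis

print("full:", effective_dimension(full))
print("collapsed:", effective_dimension(collapsed))
```

In this toy setting the second set of embeddings occupies only `k` of the `d` available dimensions, so the trailing singular values drop to numerical zero. On embeddings from a real encoder the spectrum usually decays gradually rather than cutting off sharply, so plotting the log of the singular values is likely more informative than a single count.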

Code Implementations: 1 repo
