LG · CV · NE · Jan 29, 2022

Understanding Deep Contrastive Learning via Coordinate-wise Optimization

arXiv:2201.12680v7 · 46 citations
Originality: Incremental advance
AI Analysis

This paper provides a theoretical framework for contrastive learning that could aid the design of new loss functions. Its contribution is an incremental advance in understanding rather than a solution to a broad practical problem.

The paper analyzes deep contrastive learning through a unified coordinate-wise optimization formulation called α-CL. The formulation covers existing contrastive losses and enables the design of novel losses that perform comparably to or better than InfoNCE on CIFAR-10, STL-10, and CIFAR-100.

We show that Contrastive Learning (CL) under a broad family of loss functions (including InfoNCE) has a unified formulation of coordinate-wise optimization on the network parameter $\boldsymbol{\theta}$ and pairwise importance $\alpha$, where the \emph{max player} $\boldsymbol{\theta}$ learns representation for contrastiveness, and the \emph{min player} $\alpha$ puts more weight on pairs of distinct samples that share similar representations. The resulting formulation, called $\alpha$-CL, not only unifies various existing contrastive losses, which differ in how the sample-pair importance $\alpha$ is constructed, but also extrapolates to novel contrastive losses beyond the popular ones, opening a new avenue of contrastive loss design. These novel losses yield performance comparable to (or better than) classic InfoNCE on CIFAR-10, STL-10 and CIFAR-100. Furthermore, we analyze the max player in detail: we prove that with fixed $\alpha$, the max player is equivalent to Principal Component Analysis (PCA) for deep linear networks, and almost all local minima are global and rank-1, recovering optimal PCA solutions. Finally, we extend our analysis of the max player to 2-layer ReLU networks, showing that its fixed points can have higher ranks.
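The two-player picture in the abstract can be made concrete with a small NumPy sketch. The abstract does not give the formulas, so everything below is an assumption: an InfoNCE-style loss built from halved squared Euclidean distances with temperature `tau` and constant `eps`, pairwise weights `alpha` obtained by differentiating that loss with respect to the distance gaps, and an energy that the max player would ascend with `alpha` held fixed. The function names (`half_sq_dists`, `contrastive_loss`, `pairwise_importance`, `energy`) are hypothetical, and the code is not the authors' implementation.

```python
import numpy as np


def half_sq_dists(z, z_pos):
    """d_pos[i] = ||z[i] - z_pos[i]||^2 / 2 and d_neg[i, j] = ||z[i] - z[j]||^2 / 2."""
    d_pos = 0.5 * np.sum((z - z_pos) ** 2, axis=1)
    diff = z[:, None, :] - z[None, :, :]
    d_neg = 0.5 * np.sum(diff ** 2, axis=2)
    return d_pos, d_neg


def contrastive_loss(z, z_pos, tau=0.5, eps=1.0):
    """Assumed InfoNCE-style loss:
    sum_i tau * log(eps + sum_{j != i} exp((d_pos[i] - d_neg[i, j]) / tau)).
    No numerical stabilization; this is only a sketch.
    """
    d_pos, d_neg = half_sq_dists(z, z_pos)
    w = np.exp((d_pos[:, None] - d_neg) / tau)
    np.fill_diagonal(w, 0.0)  # drop the j == i term
    return float(np.sum(tau * np.log(eps + w.sum(axis=1))))


def pairwise_importance(z, z_pos, tau=0.5, eps=1.0):
    """Assumed weights alpha[i, j]: derivative of the loss above with respect
    to the gap d_pos[i] - d_neg[i, j]. Within a row, closer pairs (smaller
    d_neg[i, j]) receive larger weight."""
    d_pos, d_neg = half_sq_dists(z, z_pos)
    w = np.exp((d_pos[:, None] - d_neg) / tau)
    np.fill_diagonal(w, 0.0)
    return w / (eps + w.sum(axis=1, keepdims=True))


def energy(z, z_pos, alpha):
    """Assumed max-player objective with alpha held fixed:
    sum_{i, j != i} alpha[i, j] * (d_neg[i, j] - d_pos[i])."""
    d_pos, d_neg = half_sq_dists(z, z_pos)
    return float(np.sum(alpha * (d_neg - d_pos[:, None])))


# Toy usage: two noisy views of six random 4-dimensional embeddings.
rng = np.random.default_rng(0)
z = rng.normal(size=(6, 4))
z_pos = z + 0.1 * rng.normal(size=(6, 4))
alpha = pairwise_importance(z, z_pos)
print(alpha.round(3))
print(contrastive_loss(z, z_pos), energy(z, z_pos, alpha))
```

Under these assumptions, the gradient of `contrastive_loss` with respect to `z` is the negative of the gradient of `energy` with `alpha` held fixed. That is one way to read the abstract's statement that the max player learns the representation while $\alpha$ reweights sample pairs. Swapping in a different rule for building `alpha` would correspond to a different loss in the family.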

Code Implementations: 1 repo
