CV · LG · Jun 1, 2022

Generalized Supervised Contrastive Learning

arXiv:2206.00384v2 · 4 citations · h-index: 21
Originality: Incremental advance
AI Analysis

This work addresses a limitation of supervised contrastive learning in computer vision: its reliance on one-hot labels. The proposed loss allows integration with techniques such as CutMix and knowledge distillation, though the contribution is incremental, extending existing contrastive approaches.

The paper tackles supervised contrastive learning's inability to handle labels expressed as probability distributions by introducing a generalized supervised contrastive loss. With ResNet50, it reports a top-1 accuracy of 77.3% on ImageNet, a 4.1% relative improvement over traditional supervised contrastive learning, and new state-of-the-art accuracies of 98.2% on CIFAR10 and 87.0% on CIFAR100 for that architecture.

With the recent promising results of contrastive learning in the self-supervised learning paradigm, supervised contrastive learning has successfully extended these contrastive approaches to supervised contexts, outperforming cross-entropy on various datasets. However, supervised contrastive learning inherently employs label information in a binary form--either positive or negative--using a one-hot target vector. This structure struggles to adapt to methods that exploit label information as a probability distribution, such as CutMix and knowledge distillation. In this paper, we introduce a generalized supervised contrastive loss, which measures cross-entropy between label similarity and latent similarity. This concept enhances the capabilities of supervised contrastive loss by fully utilizing the label distribution and enabling the adaptation of various existing techniques for training modern neural networks. Leveraging this generalized supervised contrastive loss, we construct a tailored framework: the Generalized Supervised Contrastive Learning (GenSCL). Compared to existing contrastive learning frameworks, GenSCL incorporates additional enhancements, including advanced image-based regularization techniques and an arbitrary teacher classifier. When applied to ResNet50 with the Momentum Contrast technique, GenSCL achieves a top-1 accuracy of 77.3% on ImageNet, a 4.1% relative improvement over traditional supervised contrastive learning. Moreover, our method establishes new state-of-the-art accuracies of 98.2% and 87.0% on CIFAR10 and CIFAR100 respectively when applied to ResNet50, marking the highest reported figures for this architecture.
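The abstract describes the loss only as a cross-entropy between label similarity and latent similarity, so the exact formulation is not available here. The sketch below is one possible reading, not the authors' implementation. It assumes that label similarity is the dot product of two label distributions, normalized per anchor, and that latent similarity is a temperature-scaled softmax over cosine similarities. The function name `generalized_supcon_loss` and the `temperature` parameter are hypothetical. Under these assumptions, one-hot labels reduce the loss to the usual supervised contrastive loss, while soft labels (for example, CutMix-style mixtures or teacher predictions) produce graded targets.

```python
import numpy as np


def generalized_supcon_loss(z, y, temperature=0.1):
    """Cross-entropy between label similarity and latent similarity.

    z: (N, D) embeddings. y: (N, C) label distributions (rows sum to 1).
    Assumed formulation for illustration only; requires N >= 2 and at
    least one anchor with non-zero label similarity to another sample.
    """
    z = np.asarray(z, dtype=float)
    y = np.asarray(y, dtype=float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # cosine similarity
    n = z.shape[0]
    off_diag = ~np.eye(n, dtype=bool)  # exclude each anchor from its own row

    # Latent similarity: log-softmax over all other samples in the batch.
    logits = z @ z.T / temperature
    logits = np.where(off_diag, logits, -np.inf)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_q = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Label similarity: dot product of label distributions, normalized per
    # anchor so that each row is a target distribution over other samples.
    label_sim = (y @ y.T) * off_diag
    row_sum = label_sim.sum(axis=1, keepdims=True)
    p = np.divide(label_sim, row_sum,
                  out=np.zeros_like(label_sim), where=row_sum > 0)

    # Cross-entropy per anchor; the diagonal is masked to avoid 0 * -inf.
    ce = -(p * np.where(off_diag, log_q, 0.0)).sum(axis=1)
    valid = row_sum[:, 0] > 0  # skip anchors with no similar sample
    return ce[valid].mean()


# Example: a CutMix-style soft label mixes two classes with weight 0.7 / 0.3.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(4, 8))
soft_labels = np.array([
    [1.0, 0.0, 0.0],
    [0.7, 0.3, 0.0],  # mixed sample
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])
example_loss = generalized_supcon_loss(embeddings, soft_labels)
```

Here the mixed sample acts as a partial positive for both class 0 and class 1, which a binary positive/negative mask cannot express.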

