CV · Nov 17, 2020

Can Semantic Labels Assist Self-Supervised Visual Representation Learning?

arXiv:2011.08621v1 · 29 citations
AI Analysis

This work examines how semantic labels can be incorporated into self-supervised visual representation learning, and suggests a new direction for improving feature transferability.

This paper investigates the role of semantic labels in self-supervised visual representation learning, arguing that fully-supervised and self-supervised methods learn different kinds of features. The authors propose SCAN (Supervised Contrastive Adjustment in Neighborhood), an algorithm that adds semantic guidance while preserving the appearance feature embedding, and report superior performance over previous fully-supervised and self-supervised methods on a series of downstream tasks.

Recently, contrastive learning has largely advanced the progress of unsupervised visual representation learning. Pre-trained on ImageNet, some self-supervised algorithms reported higher transfer learning performance compared to fully-supervised methods, seeming to deliver the message that human labels hardly contribute to learning transferable visual features. In this paper, we defend the usefulness of semantic labels but point out that fully-supervised and self-supervised methods are pursuing different kinds of features. To alleviate this issue, we present a new algorithm named Supervised Contrastive Adjustment in Neighborhood (SCAN) that maximally prevents the semantic guidance from damaging the appearance feature embedding. In a series of downstream tasks, SCAN achieves superior performance compared to previous fully-supervised and self-supervised methods, and sometimes the gain is significant. More importantly, our study reveals that semantic labels are useful in assisting self-supervised methods, opening a new direction for the community.

