CV · Oct 31, 2020

A Survey on Contrastive Self-supervised Learning

arXiv:2011.00362v3 · 1728 citations
Originality: Synthesis-oriented
AI Analysis

This is a survey paper: it summarizes existing work without introducing new methods and is aimed at researchers in computer vision, NLP, and related fields.

This paper provides an extensive review of contrastive self-supervised learning methods, explaining pretext tasks, architectures, and comparing performance across downstream tasks like image classification and object detection.

Self-supervised learning has gained popularity because it avoids the cost of annotating large-scale datasets. It can adopt self-defined pseudo-labels as supervision and use the learned representations for several downstream tasks. In particular, contrastive learning has recently become a dominant component of self-supervised learning methods for computer vision, natural language processing (NLP), and other domains. It aims to embed augmented versions of the same sample close to each other while pushing apart the embeddings of different samples. This paper provides an extensive review of self-supervised methods that follow the contrastive approach. The work explains commonly used pretext tasks in a contrastive learning setup, followed by the different architectures that have been proposed so far. Next, we present a performance comparison of different methods on multiple downstream tasks such as image classification, object detection, and action recognition. Finally, we conclude with the limitations of the current methods and the need for further techniques and future directions to make substantial progress.
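The contrastive objective described in the abstract (pull two augmented views of the same sample together, push views of different samples apart) can be illustrated with a small sketch. The abstract does not name a specific loss, so the InfoNCE-style formulation below is an assumption chosen because it is a common instance of the idea; the function name `info_nce_loss`, the `temperature` default, and the use of plain NumPy arrays in place of encoder outputs are all illustrative choices, not details taken from the paper.

```python
import numpy as np


def info_nce_loss(z_a, z_b, temperature=0.1):
    """InfoNCE-style contrastive loss (illustrative sketch).

    z_a[i] and z_b[i] are assumed to be embeddings of two augmented
    views of sample i (the positive pair). Every z_b[j] with j != i
    acts as a negative for z_a[i].
    """
    # L2-normalise rows so that dot products are cosine similarities.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)

    # Pairwise similarity matrix, shape (N, N), scaled by temperature.
    logits = (z_a @ z_b.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Positive pairs sit on the diagonal; average their negative log-probability.
    return float(-np.mean(np.diag(log_prob)))


# Toy usage: random vectors stand in for encoder outputs.
rng = np.random.default_rng(0)
z = rng.normal(size=(8, 16))
aligned = info_nce_loss(z, z + 0.01 * rng.normal(size=z.shape))
mismatched = info_nce_loss(z, np.roll(z, 1, axis=0))
```

In this toy setup the loss is low when each row of the second batch is a slightly perturbed copy of the matching row in the first batch, and high when the rows are shifted so that the positives no longer line up. A real pipeline would obtain the two batches by passing two augmentations of each input through an encoder.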

