LG, CV · Oct 21, 2021

Self-Supervised Visual Representation Learning Using Lightweight Architectures

arXiv:2110.11160v1
Originality: Synthesis-oriented
AI Analysis

This work provides a comparative analysis for researchers in computer vision, but it is incremental as it builds on existing self-supervised learning methods without introducing new techniques.

The paper examined self-supervised pretext tasks for image feature extraction and conducted experiments on resource-constrained networks to evaluate performance across different model types, sizes, and pre-training amounts, establishing a benchmark for future research.

In self-supervised learning, a model is trained to solve a pretext task using a dataset whose annotations are generated by a machine. The objective is to transfer the trained weights to a downstream task in the target domain. We critically examine the most notable pretext tasks for extracting features from image data and conduct experiments on resource-constrained networks, which enable faster experimentation and deployment. We study the performance of various self-supervised techniques while keeping all other parameters uniform. We study the patterns that emerge when varying the model type, size, and amount of pre-training of the backbone, and establish a standard for future research to compare against. We also conduct comprehensive studies to understand the quality of the representations learned by different architectures.
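To make the idea of machine-created annotations concrete, the sketch below builds a pretext dataset for rotation prediction, one well-known pretext task. The abstract does not say which pretext tasks the paper evaluates, so the choice of task, the function name `make_rotation_batch`, and the random stand-in images are all assumptions made for illustration.

```python
import numpy as np


def make_rotation_batch(images, rng):
    """Build a rotation-prediction pretext batch (illustrative helper, not from the paper).

    images: array of shape (N, H, W, C) with H == W.
    Returns the rotated images and integer labels in {0, 1, 2, 3},
    where label k means the image was rotated by k * 90 degrees.
    """
    # The labels are drawn by the machine; no human annotation is involved.
    labels = rng.integers(0, 4, size=len(images))
    rotated = np.stack(
        [np.rot90(img, k=int(k), axes=(0, 1)) for img, k in zip(images, labels)]
    )
    return rotated, labels


rng = np.random.default_rng(0)
images = rng.random((8, 32, 32, 3))  # random stand-in for unlabeled images
rotated, labels = make_rotation_batch(images, rng)
```

A backbone trained to predict `labels` from `rotated` never sees a human-provided annotation; its weights would then be transferred to the downstream task.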
