LG · AI · HC · Mar 27, 2023

Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need

arXiv:2303.15256v2 · 15 citations · h-index: 137
Originality: Incremental advance
AI Analysis

This work aims to reduce annotation costs and make representation learning more flexible for machine learning practitioners, though it appears incremental in that it formalizes and generalizes existing principles.

The paper tackles a key limitation of self-supervised learning (SSL), namely its need for known positive views, by introducing Positive Active Learning (PAL), which uses an oracle to query semantic relationships between samples. The result is a theoretically grounded framework that extends to supervised and semi-supervised learning and provides a low-cost active learning solution.

Self-Supervised Learning (SSL) has emerged as the solution of choice for learning transferable representations from unlabeled data. However, SSL requires constructing samples that are known to be semantically akin, i.e., positive views. Requiring such knowledge is the main limitation of SSL and is often tackled by ad-hoc strategies, e.g., applying known data augmentations to the same input. In this work, we formalize and generalize this principle through Positive Active Learning (PAL), where an oracle queries semantic relationships between samples. PAL achieves three main objectives. First, it unveils a theoretically grounded learning framework beyond SSL, based on similarity graphs, that can be extended to tackle supervised and semi-supervised learning depending on the employed oracle. Second, it provides a consistent algorithm to embed a priori knowledge, e.g., some observed labels, into any SSL loss without any change in the training pipeline. Third, it provides a proper active learning framework that yields low-cost solutions for annotating datasets and relies on queries about semantic relationships between inputs that are simple enough for non-experts to answer, arguably bridging the gap between the theory and practice of active learning.
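
To make the similarity-graph idea concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the paper's algorithm: the names `label_oracle`, `augmentation_oracle`, `build_similarity_graph`, and `graph_alignment_loss` are hypothetical, and the loss shown is only a simple invariance term over related pairs rather than any specific SSL loss. The point is that changing the oracle changes the graph while the rest of the pipeline stays the same.

```python
import numpy as np

def label_oracle(labels):
    """Hypothetical oracle: two samples are related iff they share a label."""
    def query(i, j):
        return labels[i] == labels[j]
    return query

def augmentation_oracle(source_ids):
    """Hypothetical SSL-style oracle: two views are related iff they were
    produced from the same original input."""
    def query(i, j):
        return source_ids[i] == source_ids[j]
    return query

def build_similarity_graph(n, query, pairs=None):
    """Build a symmetric 0/1 adjacency matrix from oracle answers.

    `pairs` optionally restricts which (i, j) pairs are queried, which is
    how a limited annotation budget could be modeled (an assumption here).
    """
    G = np.eye(n)
    if pairs is None:
        pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    for i, j in pairs:
        if query(i, j):
            G[i, j] = G[j, i] = 1.0
    return G

def graph_alignment_loss(Z, G):
    """Mean squared distance between embeddings of related pairs.

    This is only the invariance term; on its own it is minimized by a
    collapsed embedding, so practical SSL losses add a term against collapse.
    """
    diff = Z[:, None, :] - Z[None, :, :]      # pairwise differences, shape (n, n, d)
    sq_dist = (diff ** 2).sum(axis=-1)        # squared distances, shape (n, n)
    off_diag = G - np.diag(np.diag(G))        # ignore self-pairs
    total = off_diag.sum()
    return float((off_diag * sq_dist).sum() / total) if total > 0 else 0.0

# Supervised-style oracle: four samples, two classes.
G_sup = build_similarity_graph(4, label_oracle([0, 0, 1, 1]))

# SSL-style oracle: four views, two per original input.
G_ssl = build_similarity_graph(4, augmentation_oracle([10, 10, 11, 11]))

Z = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 2.0], [2.0, 2.0]])
loss = graph_alignment_loss(Z, G_sup)
```

In this toy setup the two oracles happen to produce the same graph, since each class contains exactly the views of one input. With partially observed labels, `pairs` could be limited to the pairs the oracle was actually asked about.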

