CV | LG | May 31, 2023

Augmentation-aware Self-supervised Learning with Conditioned Projector

arXiv:2306.06082v4 | 6 citations
Originality: Incremental advance
AI Analysis

This paper addresses a limitation of self-supervised learning that matters to practitioners who need robust representations. The advance is incremental: it modifies existing architectures without major changes.

The paper tackles the problem that invariance to augmentations in self-supervised learning can harm downstream tasks that rely on the traits those augmentations alter. It proposes CASSLE, which fosters sensitivity to such traits by conditioning the projector on the applied augmentations. CASSLE improves over various SSL methods and reaches state-of-the-art performance in multiple downstream tasks.

Self-supervised learning (SSL) is a powerful technique for learning from unlabeled data. By learning to remain invariant to applied data augmentations, methods such as SimCLR and MoCo can reach quality on par with supervised approaches. However, this invariance may be detrimental for solving downstream tasks that depend on traits affected by augmentations used during pretraining, such as color. In this paper, we propose to foster sensitivity to such characteristics in the representation space by modifying the projector network, a common component of self-supervised architectures. Specifically, we supplement the projector with information about augmentations applied to images. For the projector to take advantage of this auxiliary conditioning when solving the SSL task, the feature extractor learns to preserve the augmentation information in its representations. Our approach, coined Conditional Augmentation-aware Self-supervised Learning (CASSLE), is directly applicable to typical joint-embedding SSL methods regardless of their objective functions. Moreover, it does not require major changes in the network architecture or prior knowledge of downstream tasks. In addition to an analysis of sensitivity towards different data augmentations, we conduct a series of experiments, which show that CASSLE improves over various SSL methods, reaching state-of-the-art performance in multiple downstream tasks.
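
The conditioning idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the augmentation information is a fixed-length vector that is concatenated with the encoder output before a two-layer MLP projector. The class name `ConditionedProjector`, the dimension names, and the example augmentation descriptors are hypothetical, and the code uses dependency-free Python rather than a deep learning framework so that it runs as written.

```python
import random


def init_linear(n_in, n_out, rng):
    """Create (weights, bias) for a dense layer with small random weights."""
    scale = 1.0 / n_in ** 0.5
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    return weights, bias


def linear(weights, bias, x):
    """Dense layer y = W x + b, with W stored as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


class ConditionedProjector:
    """Hypothetical projector that also receives augmentation parameters.

    Assumption: conditioning is done by concatenating the augmentation
    vector to the encoder features. Other conditioning schemes are possible.
    """

    def __init__(self, feat_dim, aug_dim, hidden_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.feat_dim = feat_dim
        self.aug_dim = aug_dim
        # The first layer sees features and augmentation information jointly.
        self.layer1 = init_linear(feat_dim + aug_dim, hidden_dim, rng)
        self.layer2 = init_linear(hidden_dim, out_dim, rng)

    def __call__(self, features, aug_params):
        if len(features) != self.feat_dim or len(aug_params) != self.aug_dim:
            raise ValueError("unexpected feature or augmentation vector length")
        x = list(features) + list(aug_params)
        hidden = relu(linear(*self.layer1, x))
        return linear(*self.layer2, hidden)


# Usage: one encoder output projected under two different augmentation
# descriptors (illustrative values, e.g. jitter strength, crop scale, flip).
projector = ConditionedProjector(feat_dim=4, aug_dim=3, hidden_dim=8, out_dim=2)
features = [0.5, -1.0, 0.25, 2.0]   # hypothetical encoder output
aug_a = [0.1, 0.8, 1.0]             # hypothetical descriptor for view A
aug_b = [0.9, 0.3, 0.0]             # hypothetical descriptor for view B
z_a = projector(features, aug_a)
z_b = projector(features, aug_b)
```

In a joint-embedding setup, embeddings such as `z_a` and `z_b` would be passed to the unchanged SSL objective. Only the projector input differs from a standard pipeline, which matches the abstract's claim that no major architectural changes are needed.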

Code Implementations: 1 repo