ML | LG | Aug 12, 2022

RényiCL: Contrastive Representation Learning with Skew Rényi Divergence

arXiv:2208.06270v2 | 12 citations | h-index: 54
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in self-supervised learning for computer vision and other domains, offering a robust solution with empirical gains, though it appears incremental as it builds on existing contrastive learning frameworks.

The paper tackles the sensitivity of contrastive representation learning to data augmentation by proposing RényiCL, a method that uses skew Rényi divergence to handle harder augmentations effectively and that outperforms other self-supervised methods on ImageNet without extra computational cost.

Contrastive representation learning seeks to acquire useful representations by estimating the shared information between multiple views of data. Here, the quality of the learned representations is sensitive to the choice of data augmentation: as harder data augmentations are applied, the views share more task-relevant information, but also task-irrelevant information that can hinder the generalization capability of the representation. Motivated by this, we present a new robust contrastive learning scheme, coined RényiCL, which can effectively manage harder augmentations by utilizing Rényi divergence. Our method is built upon the variational lower bound of Rényi divergence, but a naïve usage of a variational method is impractical due to its large variance. To tackle this challenge, we propose a novel contrastive objective that conducts variational estimation of a skew Rényi divergence, and we provide a theoretical guarantee on how variational estimation of the skew divergence leads to stable training. We show that Rényi contrastive learning objectives perform innate hard negative sampling and easy positive sampling simultaneously, so that they can selectively learn useful features and ignore nuisance features. Through experiments on ImageNet, we show that Rényi contrastive learning with stronger augmentations outperforms other self-supervised methods without extra regularization or computational overhead. Moreover, we validate our method on other domains such as graph and tabular data, showing empirical gains over other contrastive methods.
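To make the idea of variationally estimating a skew Rényi divergence more concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes the standard variational form of the Rényi divergence of order γ > 1, namely γ/(γ−1) · log E_P[e^{(γ−1)f}] − log E_Q[e^{γf}], and obtains the skewed version by replacing Q with the mixture αP + (1−α)Q. The function names, the parameters `gamma` and `alpha`, and the default values are illustrative assumptions, and the scaling may differ from the objective used in the paper.

```python
import numpy as np


def _logmeanexp(x):
    """Numerically stable log(mean(exp(x))) over all entries of x."""
    x = np.asarray(x, dtype=float).ravel()
    m = x.max()
    return m + np.log(np.mean(np.exp(x - m)))


def skew_renyi_bound(pos_scores, neg_scores, gamma=2.0, alpha=0.1):
    """Sketch of a variational estimate of D_gamma(P || alpha*P + (1-alpha)*Q).

    pos_scores: critic values f on positive pairs (samples from P).
    neg_scores: critic values f on negative pairs (samples from Q).
    gamma, alpha: illustrative hyperparameters (assumed, not from the source).
    """
    if gamma <= 1.0:
        raise ValueError("this sketch only covers gamma > 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")

    # First term: gamma/(gamma-1) * log E_P[exp((gamma-1) * f)]
    first = gamma / (gamma - 1.0) * _logmeanexp((gamma - 1.0) * np.asarray(pos_scores))

    # Second term: log( alpha * E_P[exp(gamma f)] + (1-alpha) * E_Q[exp(gamma f)] ),
    # evaluated in log space to avoid overflow.
    log_pos = _logmeanexp(gamma * np.asarray(pos_scores))
    log_neg = _logmeanexp(gamma * np.asarray(neg_scores))
    second = np.logaddexp(np.log(alpha) + log_pos, np.log1p(-alpha) + log_neg)

    return first - second


# Example: positives score high, negatives score low.
estimate = skew_renyi_bound([10.0, 12.0], [-5.0, -3.0], gamma=2.0, alpha=0.1)
```

Under these assumptions the estimate can never exceed log(1/α), whatever the critic outputs, because the mixture term always contains α times the positive-pair expectation. This gives some intuition for why skewing can tame the variance of the plain variational estimator. In a training loop one would minimize the negative of this quantity with respect to the critic's parameters.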

Code Implementations: 1 repo
