CV, Aug 6, 2021

Improving Contrastive Learning by Visualizing Feature Transformation

Rui Zhu, Bingchen Zhao, Jingen Liu, Zhenglong Sun, Chang Wen Chen

arXiv:2108.02982v121.385 citations, Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving unsupervised feature learning for computer vision tasks, offering incremental advancements in contrastive learning methods.

The paper enhances contrastive self-supervised learning through feature-level data manipulation: extrapolating positives to create harder positive pairs and interpolating among negatives to diversify them. The authors report accuracy gains of at least 6.0% on ImageNet-100 over the MoCo baseline and about 2.0% on ImageNet-1K over the MoCoV2 baseline.

Contrastive learning, which aims at minimizing the distance between positive pairs while maximizing that of negative ones, has been widely and successfully applied in unsupervised feature learning, where the design of positive and negative (pos/neg) pairs is one of its keys. In this paper, we attempt to devise a feature-level data manipulation, differing from data augmentation, to enhance generic contrastive self-supervised learning. To this end, we first design a visualization scheme for the pos/neg score distribution (the pos/neg score is the cosine similarity of a pos/neg pair), which enables us to analyze, interpret and understand the learning process. To our knowledge, this is the first attempt of its kind. More importantly, leveraging this tool, we gain some significant observations, which inspire our novel Feature Transformation proposals, including the extrapolation of positives. This operation creates harder positives to boost the learning, because hard positives enable the model to be more view-invariant. Besides, we propose the interpolation among negatives, which provides diversified negatives and makes the model more discriminative. It is the first attempt to deal with both challenges simultaneously. Experimental results show that our proposed Feature Transformation improves accuracy by at least 6.0% on ImageNet-100 over the MoCo baseline, and by about 2.0% on ImageNet-1K over the MoCoV2 baseline. Transfer to downstream tasks demonstrates that our model is less task-biased. Visualization tools and code are available at https://github.com/DTennant/CL-Visualizing-Feature-Transformation .
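The abstract names three ingredients (pos/neg cosine scores, extrapolation of positives, interpolation among negatives) without giving formulas. The NumPy sketch below is one plausible reading, not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the mixing coefficient `lam`, the symmetric mixing rule, the random pairing of negatives, and the re-normalization after mixing. The linked repository is the place to check the actual method.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit L2 norm."""
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)

def pos_neg_scores(q, k, negatives):
    """Cosine similarity of positive pairs (B,) and negative pairs (B, K)."""
    q, k, negatives = l2_normalize(q), l2_normalize(k), l2_normalize(negatives)
    pos = np.sum(q * k, axis=1)   # one score per (query, key) pair
    neg = q @ negatives.T         # every query against every negative
    return pos, neg

def extrapolate_positives(q, k, lam):
    """Assumed rule: with lam >= 1, push the two views away from each other
    along the line joining them, which lowers their cosine similarity."""
    q_new = lam * q + (1.0 - lam) * k
    k_new = lam * k + (1.0 - lam) * q
    return l2_normalize(q_new), l2_normalize(k_new)

def interpolate_negatives(negatives, rng):
    """Assumed rule: mix each negative with a randomly chosen other negative,
    using a per-row coefficient drawn uniformly from [0, 1)."""
    perm = rng.permutation(len(negatives))
    lam = rng.uniform(0.0, 1.0, size=(len(negatives), 1))
    mixed = lam * negatives + (1.0 - lam) * negatives[perm]
    return l2_normalize(mixed)

# Toy data standing in for encoder outputs (hypothetical shapes).
rng = np.random.default_rng(0)
q = l2_normalize(rng.normal(size=(8, 16)))
k = l2_normalize(q + 0.3 * rng.normal(size=(8, 16)))   # a nearby "view" of q
negatives = l2_normalize(rng.normal(size=(32, 16)))

pos_before, _ = pos_neg_scores(q, k, negatives)
q_hard, k_hard = extrapolate_positives(q, k, lam=1.5)
pos_after, _ = pos_neg_scores(q_hard, k_hard, negatives)
mixed_negatives = interpolate_negatives(negatives, rng)

print("mean pos score before:", pos_before.mean())
print("mean pos score after: ", pos_after.mean())
```

Under this rule, extrapolation with `lam >= 1` can only lower (or keep) the cosine similarity of a positive pair, which matches the abstract's description of "harder positives". Histogramming `pos` and `neg` over training is one simple way to approximate the score-distribution visualization the abstract mentions.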

