CV, Nov 5, 2020

Center-wise Local Image Mixture For Contrastive Representation Learning

arXiv:2011.02697v30.008 citations
AI Analysis55

This paper addresses a limitation of instance discrimination in contrastive representation learning, offering an incremental improvement.

The paper tackles the problem that contrastive learning ignores semantic similarity among samples. It proposes CLIM, which draws positives from other samples via center-wise local image selection and data mixture, achieving 75.5% top-1 accuracy with linear evaluation over ResNet-50 and 59.3% top-1 accuracy when fine-tuned with only 1% of the labels.

Contrastive learning based on instance discrimination trains a model to discriminate different transformations of the anchor sample from other samples, which does not consider the semantic similarity among samples. This paper proposes a new contrastive learning method, named CLIM, which uses positives from other samples in the dataset. This is achieved by searching for locally similar samples of the anchor and selecting those that are closer to the corresponding cluster center, which we denote as center-wise local image selection. The selected samples are instantiated via a data mixture strategy, which acts as a smoothing regularization. As a result, CLIM encourages both local similarity and global aggregation in a robust way, which we find is beneficial for feature representation. In addition, we introduce multi-resolution augmentation, which enables the representation to be scale invariant. We reach 75.5% top-1 accuracy with linear evaluation over ResNet-50, and 59.3% top-1 accuracy when fine-tuned with only 1% of the labels.
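
The selection and mixture steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that features are already extracted, that cluster centers and assignments come from some clustering step such as k-means, and that the mixture is a simple mixup-style pixel blend (the abstract only says "data mixture"). The names `center_wise_local_selection`, `mix_images`, `k`, and `lam` are hypothetical.

```python
import numpy as np


def center_wise_local_selection(features, centers, labels, anchor, k=5):
    """Pick positives for `anchor` (assumed procedure, for illustration).

    Step 1 (local): take the k nearest neighbours of the anchor.
    Step 2 (center-wise): keep only neighbours that are closer to the
    anchor's cluster center than the anchor itself is.
    """
    center = centers[labels[anchor]]

    # Euclidean distance from every sample to the anchor.
    dist_to_anchor = np.linalg.norm(features - features[anchor], axis=1)
    dist_to_anchor[anchor] = np.inf  # never select the anchor itself
    neighbours = np.argsort(dist_to_anchor)[:k]

    # Euclidean distance from every sample to the anchor's cluster center.
    dist_to_center = np.linalg.norm(features - center, axis=1)
    keep = dist_to_center[neighbours] < dist_to_center[anchor]
    return neighbours[keep]


def mix_images(anchor_img, positive_img, lam):
    """Mixup-style blend (an assumption): lam weights the anchor image."""
    return lam * anchor_img + (1.0 - lam) * positive_img


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(100, 16))      # stand-in for learned features
    assign = rng.integers(0, 4, size=100)   # stand-in cluster assignments
    cents = np.stack([feats[assign == c].mean(axis=0) for c in range(4)])
    positives = center_wise_local_selection(feats, cents, assign, anchor=0, k=10)
    print("selected positives:", positives)
```

In this sketch the neighbour search enforces local similarity, while the center-distance filter pulls the anchor toward its cluster, which is one way to read the abstract's "local similarity and global aggregation".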

