CV · Jul 2, 2024

Multi-Grained Contrast for Data-Efficient Unsupervised Representation Learning

arXiv:2407.02014v1 · 5 citations · h-index: 4 · Has Code
Originality Highly original
AI Analysis

The paper addresses the need for more generalizable unsupervised representations in computer vision, offering a data-efficient approach that applies to tasks such as object detection and segmentation.

The paper tackles the limited transferability of single-grained contrastive learning by proposing a Multi-Grained Contrast method that learns representations across multiple granularity levels. Without large-scale pretraining, it reports significantly outperforming state-of-the-art methods on multiple downstream tasks.

Existing contrastive learning methods mainly focus on single-grained representation learning, e.g., at the part, object, or scene level, and thus inevitably neglect the transferability of representations to other granularity levels. In this paper, we aim to learn multi-grained representations, which can effectively describe the image at various granularity levels, thus improving generalization on extensive downstream tasks. To this end, we propose a novel Multi-Grained Contrast method (MGC) for unsupervised representation learning. Specifically, we construct delicate multi-grained correspondences between positive views and then conduct multi-grained contrast using these correspondences to learn more general unsupervised representations. Without pretraining on a large-scale dataset, our method significantly outperforms the existing state-of-the-art methods on extensive downstream tasks, including object detection, instance segmentation, scene parsing, semantic segmentation and keypoint detection. Moreover, experimental results support the data efficiency and excellent representation transferability of our method. The source code and trained weights are available at https://github.com/visresearch/mgc.
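To make the idea of contrasting at several granularity levels concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the correspondences are built, so the sketch assumes (for illustration only) that patch features are average-pooled into coarser regions, that each region in one view is matched to its most similar region in the other view, and that a standard InfoNCE loss is applied at each level. The names `pool_regions`, `info_nce`, `multi_grained_contrast`, the `grains` tuple, and the temperature value are all hypothetical.

```python
import numpy as np


def pool_regions(feats, grain):
    """Average-pool an (H, W, D) patch-feature map into (H//grain * W//grain, D) regions."""
    h, w, d = feats.shape
    assert h % grain == 0 and w % grain == 0, "grain must divide H and W"
    pooled = feats.reshape(h // grain, grain, w // grain, grain, d).mean(axis=(1, 3))
    return pooled.reshape(-1, d)


def l2_normalize(x, eps=1e-8):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def info_nce(sim, pos_idx, temperature=0.2):
    """InfoNCE loss: row i of `sim` treats column pos_idx[i] as its positive."""
    logits = sim / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-log_prob[np.arange(len(pos_idx)), pos_idx].mean())


def multi_grained_contrast(feats_a, feats_b, grains=(1, 2, 4), temperature=0.2):
    """Average an InfoNCE loss over several pooling levels (assumed scheme)."""
    losses = []
    for g in grains:
        za = l2_normalize(pool_regions(feats_a, g))
        zb = l2_normalize(pool_regions(feats_b, g))
        sim = za @ zb.T  # region-to-region cosine similarity
        # Assumption: the correspondence is the nearest region in the other view.
        pos_idx = sim.argmax(axis=1)
        losses.append(info_nce(sim, pos_idx, temperature))
    return float(np.mean(losses))


rng = np.random.default_rng(0)
view_a = rng.normal(size=(8, 8, 16))                  # stand-in patch features, view 1
view_b = view_a + 0.1 * rng.normal(size=(8, 8, 16))   # stand-in for an augmented view 2
loss = multi_grained_contrast(view_a, view_b)
```

In a real training setup the features would come from an encoder and the loss would be computed in an autodiff framework; the repository linked in the abstract is the reference for the actual method.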

Code Implementations: 1 repo
