CV · Feb 12, 2025

A Survey on Data Curation for Visual Contrastive Learning: Why Crafting Effective Positive and Negative Pairs Matters

Shasvat Desai, Debasmita Ghose, Deep Chakraborty

arXiv:2502.08134v1 · 3.61 citations · h-index: 6

Originality: Synthesis-oriented

AI Analysis

This survey addresses how to make contrastive pre-training more effective for researchers and practitioners who apply it to downstream tasks.

The authors examine data curation for visual contrastive learning, emphasizing that the choice of positive and negative pairs affects representation quality and training efficiency. A well-curated set of pairs can yield stronger representations and faster convergence.

Visual contrastive learning aims to learn representations by contrasting similar (positive) and dissimilar (negative) pairs of data samples. The design of these pairs significantly impacts representation quality, training efficiency, and computational cost. A well-curated set of pairs leads to stronger representations and faster convergence. As contrastive pre-training sees wider adoption for solving downstream tasks, data curation becomes essential for optimizing its effectiveness. In this survey, we attempt to create a taxonomy of existing techniques for positive and negative pair curation in contrastive learning, and describe them in detail.
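To make the role of pair selection concrete, the following is a minimal sketch of one widely used contrastive objective, InfoNCE. It is an illustration only and is not taken from the survey: the function names, the toy 2-D embeddings, and the temperature of 0.1 are all assumptions chosen for readability. It shows that, for the same anchor and positive, negatives that lie close to the anchor ("hard" negatives) produce a larger loss than negatives that lie far away ("easy" negatives).

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor, one positive, and a list of negatives.

    Hypothetical helper for illustration; the temperature value is an assumption.
    """
    pos_logit = cosine(anchor, positive) / temperature
    logits = [pos_logit] + [cosine(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability of picking the positive among all candidates.
    return log_denominator - pos_logit


# Toy embeddings (assumed values, not real data).
anchor = [1.0, 0.0]
positive = [0.9, 0.1]
easy_negatives = [[-1.0, 0.0], [0.0, -1.0]]  # far from the anchor
hard_negatives = [[0.8, 0.6], [0.7, -0.7]]   # close to the anchor

loss_easy = info_nce(anchor, positive, easy_negatives)
loss_hard = info_nce(anchor, positive, hard_negatives)
print(f"easy negatives: {loss_easy:.6f}")
print(f"hard negatives: {loss_hard:.6f}")
```

In this toy setting the easy negatives contribute almost nothing to the loss, so they carry little training signal, while the hard negatives contribute a noticeably larger loss. This is one way to see why the curation of pairs can affect both representation quality and convergence speed.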
