CV · Mar 13, 2023

Nearest-Neighbor Inter-Intra Contrastive Learning from Unlabeled Videos

Amazon
arXiv:2303.07317v1 · 1 citation · h-index: 24
Originality: Incremental advance
AI Analysis

This paper addresses the limited diversity of positive pairs in self-supervised video learning. The advance is incremental, since it builds on prior contrastive methods.

The paper tackles the limitation of existing video contrastive learning methods that only use clips from the same video as positives by introducing nearest-neighbor videos from the global space as additional positive pairs, which improves performance on various video tasks.

Contrastive learning has recently narrowed the gap between self-supervised and supervised methods in the image and video domains. State-of-the-art video contrastive learning methods such as CVRL and $ρ$-MoCo spatiotemporally augment two clips from the same video as positives. By sampling positive clips only locally from a single video, these methods neglect other semantically related videos that can also be useful. To address this limitation, we leverage nearest-neighbor videos from the global space as additional positive pairs, thus improving positive key diversity and introducing a more relaxed notion of similarity that extends beyond video and even class boundaries. Our method, Inter-Intra Video Contrastive Learning (IIVCL), improves performance on a range of video tasks.
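The idea of pairing an intra-video positive with a nearest-neighbor (inter-video) positive can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the separate `support` and `negatives` sets, the cosine-similarity lookup, the temperature value, and the simple weighted sum of the two InfoNCE terms are all assumptions made for illustration, and the abstract does not specify any of them.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale embeddings to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def info_nce(query, positive, negatives, temperature=0.1):
    """Mean InfoNCE loss; each query has one positive and shares K negatives."""
    pos_logit = np.sum(query * positive, axis=1, keepdims=True)   # (B, 1)
    neg_logits = query @ negatives.T                               # (B, K)
    logits = np.concatenate([pos_logit, neg_logits], axis=1) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)            # numerical stability
    log_prob_pos = logits[:, 0] - np.log(np.exp(logits).sum(axis=1))
    return float(-log_prob_pos.mean())

def nearest_neighbors(keys, support):
    """For each key, return its most similar embedding in the support set."""
    idx = np.argmax(keys @ support.T, axis=1)
    return support[idx]

def inter_intra_loss(query, key, support, negatives,
                     temperature=0.1, inter_weight=1.0):
    """Hypothetical combined objective (assumed form, not from the paper).

    intra term: positive is a second clip from the same video (`key`).
    inter term: positive is the nearest neighbor of `key` among embeddings
                of other videos (`support`).
    """
    intra = info_nce(query, key, negatives, temperature)
    nn_key = nearest_neighbors(key, support)
    inter = info_nce(query, nn_key, negatives, temperature)
    return intra + inter_weight * inter

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = l2_normalize(rng.normal(size=(4, 8)))     # query clip embeddings
    k = l2_normalize(rng.normal(size=(4, 8)))     # second clip, same videos
    support = l2_normalize(rng.normal(size=(16, 8)))   # other videos' embeddings
    negatives = l2_normalize(rng.normal(size=(32, 8)))
    print(inter_intra_loss(q, k, support, negatives))
```

In this sketch the nearest neighbor is looked up from the key embedding rather than the query, and the support set is kept disjoint from the negatives so that a neighbor used as a positive is not also penalized as a negative. Both are design choices of the example, not claims about IIVCL.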

