CV · Dec 21, 2022

MoQuad: Motion-focused Quadruple Construction for Video Contrastive Learning

arXiv:2212.10870v13 citations · h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses the challenge of motion feature extraction in video analysis for tasks like action recognition, representing an incremental improvement over existing contrastive learning methods.

The paper tackles the problem of learning effective motion features in video representation learning. It proposes MoQuad, a sample construction strategy that strengthens motion feature learning in video contrastive learning, and reports 93.7% accuracy on UCF-101 action recognition after pre-training on Kinetics-400 for 200 epochs.

Learning effective motion features is an essential pursuit of video representation learning. This paper presents a simple yet effective sample construction strategy to boost the learning of motion features in video contrastive learning. The proposed method, dubbed Motion-focused Quadruple Construction (MoQuad), augments instance discrimination by meticulously disturbing the appearance and motion of both the positive and negative samples to create a quadruple for each video instance, such that the model is encouraged to exploit motion information. Unlike recent approaches that create extra auxiliary tasks for learning motion features or apply explicit temporal modelling, our method keeps the simple and clean contrastive learning paradigm (i.e., SimCLR) without multi-task learning or extra modelling. In addition, we design two extra training strategies by analyzing initial MoQuad experiments. By simply applying MoQuad to SimCLR, extensive experiments show that we achieve superior performance on downstream tasks compared to the state of the art. Notably, on the UCF-101 action recognition task, we achieve 93.7% accuracy after pre-training the model on Kinetics-400 for only 200 epochs, surpassing various previous methods.
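
The quadruple idea in the abstract can be illustrated with a small sketch. The abstract does not specify the disturbance operators, so everything below is an assumption made for illustration: `disturb_appearance` blends one static random image into every frame, `disturb_motion` reverses the frame order, and the function names, the `strength` parameter, and the clip shape are all hypothetical. It is not the authors' implementation.

```python
import numpy as np


def disturb_appearance(clip, rng, strength=0.3):
    """Hypothetical appearance disturbance: blend one static random image
    into every frame. Frame-to-frame differences are only rescaled, so the
    motion cue is preserved while the per-frame appearance changes."""
    static = rng.random(clip.shape[1:], dtype=np.float32)
    return (1.0 - strength) * clip + strength * static


def disturb_motion(clip):
    """Hypothetical motion disturbance: reverse the frame order. The set of
    frames (appearance) is unchanged, but the temporal dynamics differ."""
    return clip[::-1].copy()


def build_quadruple(clip, rng):
    """Return four clips for one video instance (an assumed arrangement):
    two positives that keep the original motion and two negatives whose
    motion is disturbed, each pair with and without appearance disturbance."""
    positive = clip
    positive_app = disturb_appearance(clip, rng)
    negative = disturb_motion(clip)
    negative_app = disturb_appearance(negative, rng)
    return positive, positive_app, negative, negative_app


rng = np.random.default_rng(0)
clip = rng.random((8, 16, 16, 3), dtype=np.float32)  # (frames, H, W, C)
quad = build_quadruple(clip, rng)
```

In a contrastive objective, the two positives would be pulled toward the anchor and the two negatives pushed away. Because appearance alone cannot separate these groups under this construction, the encoder has to rely on motion, which is the effect the abstract describes.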

