CV
Aug 28, 2024

Online pre-training with long-form videos

arXiv:2408.15651v1
h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses action recognition for video analysis, but it appears incremental as it compares existing pre-training methods on a new data type.

The study tackled the problem of improving action recognition by pre-training models on long-form videos, finding that online pre-training with contrastive learning achieved the highest performance on downstream tasks.

In this study, we investigate the impact of online pre-training with continuous video clips. We examine three pre-training methods (masked image modeling, contrastive learning, and knowledge distillation) and assess their performance on downstream action recognition tasks. Online pre-training with contrastive learning showed the highest performance on downstream tasks. Our findings suggest that learning from long-form videos can be helpful for action recognition with short videos.
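
For readers unfamiliar with contrastive learning, the general idea can be sketched as follows. This is a generic InfoNCE-style loss and not the paper's implementation: the abstract does not specify the exact objective, so the function name `info_nce_loss`, the temperature value, and the choice to treat temporally adjacent clips as positive pairs are all assumptions made for illustration.

```python
import math

def info_nce_loss(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss; anchors[i] and positives[i] form the positive pair.

    Hypothetical helper for illustration only, not taken from the paper.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    a = [normalize(v) for v in anchors]
    p = [normalize(v) for v in positives]
    total = 0.0
    for i, ai in enumerate(a):
        # Cosine similarity of anchor i to every candidate, scaled by temperature.
        logits = [sum(x * y for x, y in zip(ai, pj)) / temperature for pj in p]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index i as the correct class.
        total += log_denominator - logits[i]
    return total / len(a)

# Toy embeddings standing in for encoder outputs of temporally adjacent clips
# (assumed pairing scheme; a real pipeline would use a video encoder).
clips_t = [[1.0, 0.0], [0.0, 1.0]]
clips_t_next = [[0.9, 0.1], [0.1, 0.9]]

aligned = info_nce_loss(clips_t, clips_t_next)
shuffled = info_nce_loss(clips_t, clips_t_next[::-1])
```

The loss is small when each clip embedding is closest to its own pair (`aligned`) and large when the pairs are mismatched (`shuffled`), which is the signal that pulls matching clips together during pre-training.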

