CV · Mar 12, 2020

Top-1 Solution of Multi-Moments in Time Challenge 2019

arXiv:2003.05837v2 · 3 citations · Has Code
AI Analysis

This work addresses video action recognition for computer vision researchers, but it is incremental as it builds on and ensembles established methods.

The team tackled the Multi-Moments in Time challenge by proposing a novel temporal interlacing network and exploring existing methods such as SlowFast. Their ensemble took first place, with 67.22% on the validation set and 60.77% on the test set.

In this technical report, we briefly introduce the solutions of our team 'Efficient' for the Multi-Moments in Time challenge at ICCV 2019. We first conduct several experiments with the popular image-based action recognition methods TRN, TSN, and TSM. We then propose a novel temporal interlacing network for fast and accurate recognition. We also explore the SlowFast network and its variants. Finally, we ensemble all of the above models and achieve 67.22% on the validation set and 60.77% on the test set, which ranks 1st on the final leaderboard. In addition, we release a new PyTorch-based code repository for video understanding that unifies state-of-the-art 2D and 3D methods. The challenge solution is included in the repository, which is available at https://github.com/Sense-X/X-Temporal.
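The abstract does not say how the models were ensembled. As a minimal sketch of one common approach, assuming simple late fusion (a weighted average of each model's per-class scores for a clip), the idea could look like the following. The function name `ensemble_scores`, the model keys, and the score values are hypothetical illustrations and are not taken from the report or the X-Temporal repository.

```python
from typing import Dict, List, Optional


def ensemble_scores(
    model_scores: Dict[str, List[float]],
    weights: Optional[Dict[str, float]] = None,
) -> List[float]:
    """Fuse per-class scores from several models by weighted averaging.

    Assumption: every model outputs one score per class for the same clip,
    in the same class order. This is generic late fusion, not necessarily
    the scheme used by the challenge entry.
    """
    names = list(model_scores)
    if not names:
        raise ValueError("at least one model is required")

    # Default to equal weights when none are given.
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    num_classes = len(model_scores[names[0]])
    fused = [0.0] * num_classes
    for name in names:
        scores = model_scores[name]
        if len(scores) != num_classes:
            raise ValueError(f"model {name!r} has a different number of classes")
        for i, score in enumerate(scores):
            # Normalise by the weight total so the result stays a weighted mean.
            fused[i] += weights[name] * score / total
    return fused


# Hypothetical per-class scores for one clip with three labels.
example = {
    "tsn": [0.9, 0.2, 0.4],
    "tsm": [0.7, 0.4, 0.6],
    "slowfast": [0.8, 0.6, 0.2],
}
fused = ensemble_scores(example)
```

Because the task is multi-label, the fused scores would typically be ranked or thresholded per class rather than reduced to a single argmax; per-model weights could be tuned on the validation set.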
