The Multi-Modal Video Reasoning and Analyzing Competition
The paper organizes a competition to benchmark methods for multi-modal video analysis tasks, but its contribution is incremental because it builds on existing datasets and tasks.
The paper introduces the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) at ICCV 2021, which comprises four tracks based on the SUTD-TrafficQA and UAV-Human datasets, and summarizes the top-performing methods submitted by participants along with their results.
In this paper, we introduce the Multi-Modal Video Reasoning and Analyzing Competition (MMVRAC) workshop, held in conjunction with ICCV 2021. The competition comprises four tracks: video question answering, skeleton-based action recognition, fisheye video-based action recognition, and person re-identification. All four are based on two datasets, SUTD-TrafficQA and UAV-Human. We summarize the top-performing methods submitted by the participants and report the results they achieved.