CV, Jun 22, 2022

NVIDIA-UNIBZ Submission for EPIC-KITCHENS-100 Action Anticipation Challenge 2022

arXiv:2206.10869v1, 1 citation, h-index: 32
AI Analysis

This work addresses action anticipation in kitchen videos and offers an incremental improvement for computer vision applications.

The authors tackle the EPIC-KITCHENS-100 action anticipation challenge with two recurrent architectures that observe only 2.5 seconds of context at inference time. Averaging the predictions of multiple models yields 19.61% overall mean top-5 recall on the test set, placing second on the public leaderboard.

In this report, we describe the technical details of our submission for the EPIC-KITCHENS-100 action anticipation challenge. Our models, the higher-order recurrent space-time transformer and the message-passing neural network with edge learning, are both recurrent architectures that observe only 2.5 seconds of inference context to form the action anticipation prediction. By averaging the prediction scores from a set of models compiled with our proposed training pipeline, we achieved strong performance on the test set: 19.61% overall mean top-5 recall, recorded as second place on the public leaderboard.
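The score averaging and the mean top-5 recall metric mentioned above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names (`average_scores`, `top_k`, `mean_top_k_recall`) and the toy scores are hypothetical, and the metric is assumed to be class-mean recall, i.e. top-k recall computed per ground-truth class and then averaged over the classes that appear.

```python
from collections import defaultdict


def average_scores(model_scores):
    """Average per-class scores across models (hypothetical helper).

    model_scores[m][i][c] is the score model m gives class c on sample i.
    """
    n_models = len(model_scores)
    n_samples = len(model_scores[0])
    averaged = []
    for i in range(n_samples):
        per_model = [scores[i] for scores in model_scores]
        # zip(*per_model) groups the scores of each class across models.
        averaged.append([sum(col) / n_models for col in zip(*per_model)])
    return averaged


def top_k(scores, k=5):
    """Return the indices of the k highest-scoring classes."""
    ranked = sorted(range(len(scores)), key=lambda c: scores[c], reverse=True)
    return ranked[:k]


def mean_top_k_recall(all_scores, labels, k=5):
    """Class-mean top-k recall (assumed definition of the metric)."""
    hits = defaultdict(int)
    counts = defaultdict(int)
    for scores, label in zip(all_scores, labels):
        counts[label] += 1
        if label in top_k(scores, k):
            hits[label] += 1
    # Average the per-class recall over classes present in the labels.
    return sum(hits[c] / counts[c] for c in counts) / len(counts)


# Toy data (made up): two models, three samples, three classes.
model_a = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.5, 0.4, 0.1]]
model_b = [[0.4, 0.5, 0.1], [0.1, 0.3, 0.6], [0.1, 0.2, 0.7]]
labels = [0, 1, 2]

ensemble = average_scores([model_a, model_b])
print(mean_top_k_recall(ensemble, labels, k=1))
```

The toy example uses k=1 only because it has three classes; the challenge metric uses k=5 over the full action vocabulary. Averaging per class rather than per sample keeps frequent actions from dominating the score, which matters for a long-tailed label distribution.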

