CV | May 25, 2023

Guided Attention for Next Active Object @ EGO4D STA Challenge

arXiv:2305.16066v3 | 1 citation
Originality: Incremental advance
AI Analysis

This work addresses the problem of anticipating object interactions in egocentric videos for applications like robotics and AR, but it is incremental as it builds on existing methods like StillFast.

The authors tackled the short-term anticipation (STA) challenge for egocentric videos in the EGO4D competition by developing a Guided-Attention mechanism that integrates object detections and spatiotemporal features to enhance motion and contextual information. Their model, built on StillFast, achieved state-of-the-art results on the test set.

In this technical report, we describe our Guided-Attention-based solution to the short-term anticipation (STA) challenge of the EGO4D competition. It combines object detections with spatiotemporal features extracted from video clips to enhance motion and contextual information, then decodes the object-centric and motion-centric information to address STA in egocentric videos. For the challenge, we build our model on top of StillFast, with Guided Attention applied to the fast network. Our model obtains better performance on the validation set and achieves state-of-the-art (SOTA) results on the test set of the EGO4D Short-Term Object Interaction Anticipation Challenge.
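As a rough illustration of combining object detections with spatiotemporal features, the sketch below uses plain scaled dot-product cross-attention: each video feature token (query) attends over object detection embeddings (keys and values), and the attended object context is added back to the video token. This is an assumption about the general shape of such a mechanism, not the authors' implementation. The function name `guided_attention`, the toy feature vectors, and the absence of learned projection layers are all hypothetical simplifications.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def guided_attention(video_tokens, object_tokens):
    """Hypothetical cross-attention sketch (not the report's actual model).

    video_tokens:  N spatiotemporal feature vectors of length d (queries)
    object_tokens: M object-detection embeddings of length d (keys and values)
    Returns (fused_tokens, attention_weights).
    """
    d = len(video_tokens[0])
    scale = 1.0 / math.sqrt(d)
    fused, weights = [], []
    for q in video_tokens:
        # Similarity between this video token and every object embedding.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k))
                  for k in object_tokens]
        w = softmax(scores)
        # Weighted sum of object embeddings: the object-guided context.
        context = [sum(w[j] * object_tokens[j][i]
                       for j in range(len(object_tokens)))
                   for i in range(d)]
        # Residual connection keeps the original video information.
        fused.append([qi + ci for qi, ci in zip(q, context)])
        weights.append(w)
    return fused, weights


# Toy inputs: two video tokens and three detected objects, d = 2.
video_tokens = [[1.0, 0.0], [0.0, 1.0]]
object_tokens = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
fused, weights = guided_attention(video_tokens, object_tokens)
```

In this toy setup, each video token places the most attention on the object embedding it aligns with. A real model would apply learned linear projections to queries, keys, and values and would operate on feature maps from the fast network rather than hand-written vectors.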

Code Implementations: 1 repo
