CV · AI · Jun 21, 2022

One-stage Action Detection Transformer

arXiv:2206.10080v1 · h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses action detection for video analysis, but it is incremental as it builds on existing transformer and one-stage detection methods.

The authors address action detection in videos with a one-stage transformer model that recognizes action categories and time boundaries simultaneously. An ensemble of these models, trained on different features, reaches 21.28% action mAP and ranks first on the test set of the EPIC-KITCHENS-100 2022 Action Detection challenge.

In this work, we introduce our solution to the EPIC-KITCHENS-100 2022 Action Detection challenge. We propose the One-stage Action Detection Transformer (OADT) to model the temporal connections between video segments. With OADT, both the category and the time boundary of an action can be recognized simultaneously. After ensembling multiple OADT models trained on different features, our model reaches 21.28% action mAP and ranks 1st on the test set of the Action Detection challenge.
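
The abstract reports action mAP but does not spell out how a detection is scored. As a rough illustration only, temporal action detection benchmarks commonly count a predicted segment as correct when its label matches a ground-truth action and the two time intervals overlap by at least some temporal IoU threshold. The sketch below assumes that convention; the function names, the `(label, (start, end))` tuple layout, and the 0.5 threshold are hypothetical choices for this example and are not taken from the paper or the challenge toolkit.

```python
def temporal_iou(pred, gt):
    """Intersection-over-union of two (start, end) intervals, in seconds."""
    p_start, p_end = pred
    g_start, g_end = gt
    # Overlap is zero when the intervals do not intersect.
    inter = max(0.0, min(p_end, g_end) - max(p_start, g_start))
    union = (p_end - p_start) + (g_end - g_start) - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred, gt, threshold=0.5):
    """Check one prediction against one ground-truth action.

    Both arguments are (label, (start, end)) tuples; this layout and
    the default threshold are assumptions made for illustration.
    """
    pred_label, pred_segment = pred
    gt_label, gt_segment = gt
    return (pred_label == gt_label
            and temporal_iou(pred_segment, gt_segment) >= threshold)
```

For instance, `is_true_positive(("cut onion", (0.0, 10.0)), ("cut onion", (1.0, 10.0)))` compares a 10-second prediction with a 9-second ground-truth action that it fully contains. A full mAP computation would additionally rank predictions by confidence and average precision over classes and thresholds, which this sketch leaves out.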

