CV, MM | May 25, 2021

Temporal Action Proposal Generation with Transformers

arXiv:2105.12043v1 | 31 citations
Originality: Incremental advance
AI Analysis

This work targets the generation of precise temporal action proposals in videos, a problem of interest to researchers in video understanding. The advance is incremental: it adapts existing Transformer architectures to a known task.

The paper tackles the temporal action proposal generation task by introducing TAPG Transformer, a unified framework using Transformers to capture dependencies at different granularities, and it outperforms state-of-the-art methods on benchmarks like ActivityNet-1.3 and THUMOS14.

Transformer networks are effective at modeling long-range contextual information and have recently demonstrated exemplary performance in natural language processing. Conventionally, the temporal action proposal generation (TAPG) task is divided into two main sub-tasks: boundary prediction, which relies on frame-level dependencies, and proposal confidence prediction, which relies on proposal-level relationships. To capture dependencies at both levels of granularity, this paper presents a unified temporal action proposal generation framework built on the original Transformer, called TAPG Transformer, which consists of a Boundary Transformer and a Proposal Transformer. Specifically, the Boundary Transformer captures long-term temporal dependencies to predict precise boundary information, and the Proposal Transformer learns the rich inter-proposal relationships for reliable confidence evaluation. Extensive experiments are conducted on two popular benchmarks, ActivityNet-1.3 and THUMOS14, and the results demonstrate that TAPG Transformer outperforms state-of-the-art methods. Equipped with an existing action classifier, our method achieves remarkable performance on the temporal action localization task. Code and models will be made available.
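To make the two sub-tasks concrete, here is a minimal sketch of how per-frame boundary probabilities and proposal-level confidences might be combined into ranked proposals. It is not the paper's implementation: the function names (`pick_peaks`, `generate_proposals`), the peak-picking rule, and the multiplicative score fusion are all assumptions, loosely modeled on a common pattern in boundary-based proposal generators. In TAPG Transformer the inputs would presumably come from the Boundary Transformer and the Proposal Transformer; here they are plain lists.

```python
from typing import List, Optional, Tuple


def pick_peaks(probs: List[float], threshold: float = 0.5) -> List[int]:
    """Return candidate boundary indices (hypothetical rule, not from the paper).

    An index is kept if it is a strict local maximum or if its probability
    is at least `threshold` times the sequence maximum.
    """
    if not probs:
        return []
    cutoff = threshold * max(probs)
    peaks = []
    for i, p in enumerate(probs):
        left = probs[i - 1] if i > 0 else float("-inf")
        right = probs[i + 1] if i + 1 < len(probs) else float("-inf")
        if (p > left and p > right) or p >= cutoff:
            peaks.append(i)
    return peaks


def generate_proposals(
    start_probs: List[float],
    end_probs: List[float],
    confidence: List[List[float]],
    max_len: Optional[int] = None,
) -> List[Tuple[int, int, float]]:
    """Pair candidate starts with later candidate ends and rank the pairs.

    `confidence[s][e]` stands in for a proposal-level score; the fused score
    start * end * confidence is an assumed fusion rule.
    """
    proposals = []
    for s in pick_peaks(start_probs):
        for e in pick_peaks(end_probs):
            if e <= s:
                continue  # an action must end after it starts
            if max_len is not None and e - s > max_len:
                continue  # optionally drop overly long proposals
            score = start_probs[s] * end_probs[e] * confidence[s][e]
            proposals.append((s, e, score))
    # Highest-scoring proposals first
    proposals.sort(key=lambda item: item[2], reverse=True)
    return proposals


if __name__ == "__main__":
    starts = [0.9, 0.1, 0.1, 0.1]
    ends = [0.1, 0.1, 0.1, 0.8]
    conf = [[0.5] * 4 for _ in range(4)]
    for s, e, score in generate_proposals(starts, ends, conf):
        print(f"proposal [{s}, {e}] score={score:.3f}")
```

The split mirrors the abstract's framing: boundary quality depends only on frame-level evidence, while the confidence term is where relationships between proposals would enter.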
