CV | May 30, 2017

Generic Tubelet Proposals for Action Localization

arXiv:1705.10861v1 | 30 citations
Originality: Highly original
AI Analysis

This paper addresses accurate action localization in videos, a known bottleneck in computer vision applications, and proposes a novel method rather than an incremental improvement.

The paper tackles action localization in videos by proposing a Tube Proposal Network (TPN) that generates generic, class-independent tubelet proposals. These proposals are integrated into a unified temporal deep network, which achieves state-of-the-art localization results on the UCF-Sports, J-HMDB21, and UCF-101 datasets.

We develop a novel framework for action localization in videos. We propose the Tube Proposal Network (TPN), which generates generic, class-independent, video-level tubelet proposals. The generated tubelet proposals can be used in various video analysis tasks, including recognizing and localizing actions in videos. In particular, we integrate these generic tubelet proposals into a unified temporal deep network for action classification. Compared with other methods, our generic tubelet proposal method is accurate, general, and fully differentiable under a smooth L1 loss function. We demonstrate the performance of our algorithm on the standard UCF-Sports, J-HMDB21, and UCF-101 datasets. Our class-independent TPN outperforms other tubelet generation methods, and our unified temporal deep network achieves state-of-the-art localization results on all three datasets.
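The abstract mentions a smooth L1 loss without spelling it out. As a rough illustration only, the sketch below shows the standard smooth L1 penalty (quadratic near zero, linear for large errors) applied to a tubelet, modeled here as one bounding box per frame. The `Box` layout, the `beta` threshold, the per-frame averaging, and the function name `tubelet_regression_loss` are all assumptions made for this example; the paper's actual regression targets and normalization are not described in this summary and may well differ (for instance, by regressing parameterized offsets rather than raw coordinates).

```python
from typing import List, Tuple

# Hypothetical layout: (x1, y1, x2, y2) corner coordinates for one frame.
Box = Tuple[float, float, float, float]
# A tubelet is modeled here as one box per frame (an assumption for illustration).
Tubelet = List[Box]


def smooth_l1(x: float, beta: float = 1.0) -> float:
    """Standard smooth L1 penalty: quadratic for |x| < beta, linear otherwise."""
    ax = abs(x)
    if ax < beta:
        return 0.5 * ax * ax / beta
    return ax - 0.5 * beta


def tubelet_regression_loss(pred: Tubelet, target: Tubelet, beta: float = 1.0) -> float:
    """Sum smooth L1 over all box coordinates, averaged over frames.

    Averaging per frame is an illustrative choice, not the paper's stated one.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must cover the same number of frames")
    if not pred:
        return 0.0
    total = 0.0
    for pred_box, target_box in zip(pred, target):
        for p, t in zip(pred_box, target_box):
            total += smooth_l1(p - t, beta)
    return total / len(pred)


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0)]
    ground_truth = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 13.0)]
    print(tubelet_regression_loss(predicted, ground_truth))
```

Because the penalty grows only linearly for large errors, a single badly placed box in one frame does not dominate the loss for the whole tubelet, which is the usual motivation for preferring smooth L1 over a squared error in box regression.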

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
