CV | May 30, 2017

Generic Tubelet Proposals for Action Localization

Jiawei He, Mostafa S. Ibrahim, Zhiwei Deng, Greg Mori

arXiv:1705.10861v16.130 citations

Originality: Highly original

AI Analysis

This paper addresses the problem of accurately localizing actions in videos. It proposes a new method for a known bottleneck rather than an incremental improvement.

The paper tackles action localization in videos by proposing a Tube Proposal Network (TPN) that generates generic, class-independent tubelet proposals. These proposals are integrated into a unified temporal deep network, which achieves state-of-the-art localization results on the UCF-Sports, J-HMDB21, and UCF-101 datasets.

We develop a novel framework for action localization in videos. We propose the Tube Proposal Network (TPN), which generates generic, class-independent, video-level tubelet proposals. The generated tubelet proposals can be used in various video analysis tasks, including recognizing and localizing actions in videos. In particular, we integrate these generic tubelet proposals into a unified temporal deep network for action classification. Compared with other methods, our generic tubelet proposal method is accurate, general, and fully differentiable under a smooth L1 loss function. We demonstrate the performance of our algorithm on the standard UCF-Sports, J-HMDB21, and UCF-101 datasets. Our class-independent TPN outperforms other tubelet generation methods, and our unified temporal deep network achieves state-of-the-art localization results on all three datasets.
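The abstract names a smooth L1 loss but does not give its exact parameterization. As a rough illustration only, the sketch below uses the commonly used form of smooth L1 (quadratic near zero, linear beyond a threshold) and averages it over the box coordinates of a tubelet, treated here as one box per frame. The names `Box`, `smooth_l1`, `tubelet_loss`, and the `beta` threshold are assumptions made for this example, not identifiers from the paper.

```python
from typing import List, Tuple

# Assumption: a tubelet is one (x1, y1, x2, y2) box per frame.
Box = Tuple[float, float, float, float]


def smooth_l1(diff: float, beta: float = 1.0) -> float:
    """Common smooth L1 form: quadratic for |diff| < beta, linear otherwise."""
    a = abs(diff)
    if a < beta:
        return 0.5 * a * a / beta
    return a - 0.5 * beta


def tubelet_loss(pred: List[Box], target: List[Box], beta: float = 1.0) -> float:
    """Mean smooth L1 over every box coordinate in a tubelet (hypothetical helper)."""
    if not pred or len(pred) != len(target):
        raise ValueError("tubelets must be non-empty and cover the same frames")
    total = sum(
        smooth_l1(p - t, beta)
        for pred_box, target_box in zip(pred, target)
        for p, t in zip(pred_box, target_box)
    )
    return total / (4 * len(pred))


if __name__ == "__main__":
    predicted = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0)]
    ground_truth = [(0.5, 0.0, 10.0, 10.0), (1.0, 1.0, 13.0, 11.0)]
    print(tubelet_loss(predicted, ground_truth))
```

Because the loss is piecewise quadratic and linear, its gradient is defined for every coordinate error, which is the property the abstract points to when it calls the proposal method fully differentiable.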
