CV, Mar 1, 2019

Progress Regression RNN for Online Spatial-Temporal Action Localization in Unconstrained Videos

arXiv:1903.00304v1, 4 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurately localizing actions in videos for applications like surveillance or video analysis, representing an incremental improvement by focusing on temporal relations.

The paper tackles the problem of online spatial-temporal action localization in unconstrained videos by introducing a Progress Regression RNN that uses temporal progress regression to infer actions, achieving state-of-the-art performance on two benchmark datasets.

Previous spatial-temporal action localization methods commonly follow the pipeline of object detection to estimate bounding boxes and labels of actions. However, the temporal relation of an action has not been fully explored. In this paper, we propose an end-to-end Progress Regression Recurrent Neural Network (PR-RNN) for online spatial-temporal action localization, which learns to infer the action by temporal progress regression. Two new action attributes, called progression and progress rate, are introduced to describe the temporal engagement and relative temporal position of an action. In our method, frame-level features are first extracted by a Fully Convolutional Network (FCN). Subsequently, detection results and action progress attributes are regressed by the Convolutional Gated Recurrent Unit (ConvGRU) based on all the observed frames instead of a single frame or a short clip. Finally, a novel online linking method is designed to connect single-frame results to spatial-temporal tubes with the help of the estimated action progress attributes. Extensive experiments demonstrate that the progress attributes improve the localization accuracy by providing a more precise temporal position of an action in unconstrained videos. Our proposed PR-RNN achieves state-of-the-art performance for most of the IoU thresholds on two benchmark datasets.
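To make the two progress attributes and the linking step more concrete, the following is a minimal sketch and not the paper's implementation. It assumes that progression is 1 inside an annotated action interval and 0 outside, and that progress rate rises linearly from 0 to 1 over that interval. The linking rule (greedy IoU matching, gated by a roughly non-decreasing progress rate) is a generic stand-in for the paper's online linking method. All names (`Detection`, `Tube`, `progress_targets`, `link_frame`) and the thresholds `iou_thresh` and `rate_slack` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Detection:
    box: tuple            # (x1, y1, x2, y2) in pixels
    score: float          # detection confidence
    progress_rate: float  # estimated relative temporal position in [0, 1]


@dataclass
class Tube:
    detections: list = field(default_factory=list)


def progress_targets(num_frames, start, end):
    """Per-frame (progression, progress_rate) targets for one action instance.

    Assumed definitions: progression is 1 inside [start, end] and 0 outside;
    progress rate grows linearly from 0 at `start` to 1 at `end`.
    """
    span = max(end - start, 1)  # avoid division by zero for one-frame actions
    targets = []
    for t in range(num_frames):
        inside = start <= t <= end
        rate = (t - start) / span if inside else 0.0
        targets.append((1.0 if inside else 0.0, rate))
    return targets


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def link_frame(tubes, detections, iou_thresh=0.5, rate_slack=0.1):
    """Greedily extend existing tubes with the current frame's detections.

    A detection may extend a tube only if it overlaps the tube's last box and
    its progress rate has not dropped by more than `rate_slack` (assumption:
    an ongoing action should not move backwards in time). Unmatched
    detections start new tubes.
    """
    unused = list(detections)
    for tube in tubes:
        last = tube.detections[-1]
        candidates = [
            d for d in unused
            if iou(last.box, d.box) >= iou_thresh
            and d.progress_rate >= last.progress_rate - rate_slack
        ]
        if candidates:
            best = max(candidates, key=lambda d: d.score)
            tube.detections.append(best)
            unused.remove(best)
    for d in unused:
        tubes.append(Tube([d]))
    return tubes
```

Calling `link_frame` once per incoming frame keeps the procedure online: each tube is updated using only the frames observed so far, and a sharp drop in progress rate is treated as the start of a new action instance rather than a continuation.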

