CVJun 22, 2017

A Self-Adaptive Proposal Model for Temporal Action Detection based on Reinforcement Learning

arXiv:1706.07251v13 citations
Originality Incremental advance
AI Analysis

This addresses the problem of high computational overhead in video action detection for researchers and practitioners, representing an incremental improvement over existing methods.

The paper tackles the computational inefficiency of existing temporal action detection methods by proposing a self-adaptive proposal model based on reinforcement learning, which achieves competitive performance on THUMOS 2014 with significantly fewer proposals.

Existing action detection algorithms usually generate action proposals through an extensive search over the video at multiple temporal scales, which brings about huge computational overhead and deviates from the human perception procedure. We argue that the process of detecting actions should be naturally one of observation and refinement: observe the current window and refine the span of attended window to cover true action regions. In this paper, we propose an active action proposal model that learns to find actions through continuously adjusting the temporal bounds in a self-adaptive way. The whole process can be deemed as an agent, which is firstly placed at a position in the video at random, adopts a sequence of transformations on the current attended region to discover actions according to a learned policy. We utilize reinforcement learning, especially the Deep Q-learning algorithm to learn the agent's decision policy. In addition, we use temporal pooling operation to extract more effective feature representation for the long temporal window, and design a regression network to adjust the position offsets between predicted results and the ground truth. Experiment results on THUMOS 2014 validate the effectiveness of the proposed approach, which can achieve competitive performance with current action detection algorithms via much fewer proposals.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes