CV · May 25, 2019

Exploring Feature Representation and Training strategies in Temporal Action Localization

arXiv:1905.10608v2 · 3 citations
Originality: Incremental advance
AI Analysis

This work helps computer vision researchers understand which factors drive performance improvements in temporal action localization. It is an incremental advance because it builds on existing methods.

The paper investigates the impact of feature representation and training strategies on temporal action localization performance, identifying key factors through ablative experiments and proposing a two-stage detector that achieves a state-of-the-art mAP@tIoU=0.5 of 44.2% on THUMOS14.

Temporal action localization has recently attracted significant interest in the Computer Vision community. However, despite the great progress, it is hard to identify which aspects of the proposed methods contribute most to the increase in localization performance. To address this issue, we conduct ablative experiments on feature extraction methods, fixed-size feature representation methods and training strategies, and report how each influences the overall performance. Based on our findings, we propose a two-stage detector that outperforms the state of the art in THUMOS14, achieving a mAP@tIoU=0.5 equal to 44.2%.
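The headline metric, mAP@tIoU=0.5, counts a predicted segment as correct only when its temporal intersection-over-union (tIoU) with a ground-truth segment reaches 0.5. The following is a minimal sketch of that matching criterion, not the authors' evaluation code; the function names and the (start, end) tuple format in seconds are assumptions made for illustration.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments, e.g. in seconds."""
    # Overlap length, clamped at zero when the segments are disjoint.
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    # Union length = sum of both durations minus the overlap.
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def matches_at_threshold(pred, gt, threshold=0.5):
    """True when the prediction overlaps the ground truth enough to count
    as a hit at the given tIoU threshold (hypothetical helper)."""
    return temporal_iou(pred, gt) >= threshold
```

For example, a prediction of (0.0, 10.0) against a ground-truth segment of (0.0, 5.0) has an overlap of 5 s and a union of 10 s, so its tIoU is exactly 0.5 and it would count as a hit at this threshold. A full mAP computation would additionally rank predictions by confidence and average precision over classes, which this sketch omits.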

