Refining Action Boundaries for One-stage Detection
This work addresses boundary refinement for one-stage action detection, an incremental improvement in video analysis.
The paper tackles inaccurate action boundaries in one-stage detection by incorporating boundary confidence estimation, achieving state-of-the-art performance on the EPIC-KITCHENS-100 and THUMOS14 benchmarks.
Current one-stage action detection methods, which simultaneously predict action boundaries and the corresponding class, neither estimate nor use a measure of confidence in their boundary predictions, which can lead to inaccurate boundaries. We incorporate boundary confidence estimation into one-stage anchor-free detection through an additional prediction head that predicts refined boundaries with higher confidence. We obtain state-of-the-art performance on the challenging EPIC-KITCHENS-100 action detection benchmark as well as the standard THUMOS14 action detection benchmark, and improve results on the ActivityNet-1.3 benchmark.
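To make the idea of using boundary confidence concrete, the following is a minimal sketch and not the method from the paper. It assumes each temporal location emits distances to the action start and end (the usual anchor-free parameterization) plus a separate confidence for each boundary, and it keeps the start and end with the highest confidence. The names `PointPrediction` and `refine_segment`, and the max-confidence selection rule itself, are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PointPrediction:
    """Hypothetical output of an anchor-free head at one temporal location."""
    t: float           # temporal location, in seconds
    d_start: float     # predicted distance back to the action start
    d_end: float       # predicted distance forward to the action end
    conf_start: float  # assumed confidence in the start estimate, in [0, 1]
    conf_end: float    # assumed confidence in the end estimate, in [0, 1]


def refine_segment(preds: List[PointPrediction]) -> Tuple[float, float]:
    """Return (start, end) taken from the most confident boundary estimates.

    All predictions in `preds` are assumed to belong to the same action
    instance. The start and end may come from different locations.
    """
    if not preds:
        raise ValueError("need at least one prediction")
    best_start = max(preds, key=lambda p: p.conf_start)
    best_end = max(preds, key=lambda p: p.conf_end)
    return best_start.t - best_start.d_start, best_end.t + best_end.d_end


# Two locations inside the same action: the first is more confident
# about the start, the second about the end.
preds = [
    PointPrediction(t=2.0, d_start=1.0, d_end=3.5, conf_start=0.9, conf_end=0.3),
    PointPrediction(t=4.0, d_start=2.5, d_end=1.0, conf_start=0.4, conf_end=0.8),
]
print(refine_segment(preds))  # (1.0, 5.0)
```

In this toy case the start comes from the first location and the end from the second, which illustrates why a per-boundary confidence can help: a single location need not be the best predictor of both boundaries.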