CV | Aug 25, 2020

Boundary Uncertainty in a Single-Stage Temporal Action Localization Network

arXiv:2008.11170v1 | 3 citations
Originality: Incremental advance
AI Analysis

This paper addresses accurate action boundary detection in videos, an incremental advance built on a novel approach to modeling boundary uncertainty.

The paper tackles temporal action localization by modeling boundary predictions as Gaussian distributions to capture uncertainty, achieving over 1.5% improvement in mAP@tIoU=0.5 and performing competitively with more complex networks.

In this paper, we address the problem of temporal action localization with a single-stage neural network. In the proposed architecture, we model the boundary predictions as univariate Gaussian distributions in order to capture their uncertainties, which is, to the best of our knowledge, the first such approach in this area. We use two uncertainty-aware boundary regression losses: first, the Kullback-Leibler divergence between the ground truth location of the boundary and the Gaussian modeling the prediction of the boundary, and second, the expectation of the $\ell_1$ loss under the same Gaussian. We show that both uncertainty modeling approaches improve the detection performance by more than $1.5\%$ in mAP@tIoU=0.5 and that the proposed simple one-stage network performs close to more complex one- and two-stage networks.
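The two losses named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the ground-truth boundary is treated as a point mass, so the KL term reduces (up to an additive constant) to the Gaussian negative log-likelihood of the target, and it assumes the network outputs a mean and a log-variance per boundary. The function names `kl_boundary_loss` and `expected_l1_loss` and the log-variance parameterization are hypothetical choices for illustration; the paper's exact formulation may differ.

```python
import math


def kl_boundary_loss(mu, log_var, target):
    """KL-style loss between a point-mass target and N(mu, exp(log_var)).

    Assumption: constant terms that do not depend on mu or log_var are dropped,
    leaving the squared error scaled by the inverse variance plus a penalty
    on the log-variance.
    """
    return 0.5 * math.exp(-log_var) * (target - mu) ** 2 + 0.5 * log_var


def expected_l1_loss(mu, log_var, target):
    """Closed form of E|x - target| for x ~ N(mu, exp(log_var)).

    This is the mean of a folded normal distribution with offset d = mu - target.
    """
    sigma = math.exp(0.5 * log_var)
    d = mu - target
    gaussian_term = sigma * math.sqrt(2.0 / math.pi) * math.exp(-d * d / (2.0 * sigma * sigma))
    offset_term = d * math.erf(d / (sigma * math.sqrt(2.0)))
    return gaussian_term + offset_term


if __name__ == "__main__":
    # Hypothetical prediction: boundary at 12.0 s, ground truth at 12.5 s.
    for log_var in (-2.0, 0.0, 2.0):
        print(log_var,
              kl_boundary_loss(12.0, log_var, 12.5),
              expected_l1_loss(12.0, log_var, 12.5))
```

In both losses the predicted variance trades off against the localization error: a larger variance softens the penalty for a distant prediction in the KL-style loss but is itself penalized, and as the variance shrinks toward zero the expected $\ell_1$ loss approaches the ordinary absolute error.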

