CV · May 6, 2020

Exploiting Inter-Frame Regional Correlation for Efficient Action Recognition

arXiv:2005.02591v1 · 2 citations
AI Analysis

This paper addresses efficiency issues in video analysis that matter to both researchers and practitioners, though the contribution is incremental: it builds on existing temporal feature extraction methods.

The paper tackles the high computational cost of optical flow in video action recognition by proposing a regional inter-frame correlation method, achieving state-of-the-art results of 96.3% on UCF101 and 76.3% on HMDB51.

Temporal feature extraction is an important issue in video-based action recognition. Optical flow is a popular method for extracting temporal features, and it performs well because it captures pixel-level correlation between consecutive frames. However, this pixel-level correlation comes at the cost of high computational complexity and large storage requirements. In this paper, we propose a novel temporal feature extraction method, named Attentive Correlated Temporal Feature (ACTF), which explores inter-frame correlation within a certain region. The proposed ACTF exploits both bilinear and linear correlation between successive frames at the regional level. Our method achieves performance comparable to or better than optical flow-based methods without computing optical flow. Experimental results demonstrate that the proposed method achieves state-of-the-art performance of 96.3% on UCF101 and 76.3% on HMDB51.
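
To make the idea of regional inter-frame correlation concrete, here is a minimal sketch in plain Python. It is not the authors' ACTF implementation: the abstract does not specify the operators, so the function name `regional_correlation`, the `radius` parameter, the use of dot products for the bilinear term, and the use of a window-mean difference for the linear term are all assumptions made for illustration. The attention component is omitted entirely.

```python
def regional_correlation(feat_t, feat_next, radius=1):
    """Illustrative regional correlation between two frame feature maps.

    feat_t, feat_next: nested lists of shape H x W x C (hypothetical layout).
    Returns (bilinear, linear):
      bilinear[y][x] holds one dot product per offset in the
        (2*radius+1)^2 window of feat_next around (y, x);
        out-of-bounds offsets contribute 0.0.
      linear[y][x] is the channel-wise difference between the mean of the
        in-bounds window in feat_next and feat_t[y][x].
    Both definitions are assumptions, not taken from the paper.
    """
    H, W, C = len(feat_t), len(feat_t[0]), len(feat_t[0][0])
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)]
    bilinear, linear = [], []
    for y in range(H):
        b_row, l_row = [], []
        for x in range(W):
            a = feat_t[y][x]
            dots = []
            acc = [0.0] * C   # running sum of in-bounds neighbours
            n = 0             # number of in-bounds neighbours (at least 1)
            for dy, dx in offsets:
                yy, xx = y + dy, x + dx
                if 0 <= yy < H and 0 <= xx < W:
                    b = feat_next[yy][xx]
                    dots.append(sum(p * q for p, q in zip(a, b)))
                    for c in range(C):
                        acc[c] += b[c]
                    n += 1
                else:
                    dots.append(0.0)
            b_row.append(dots)
            l_row.append([acc[c] / n - a[c] for c in range(C)])
        bilinear.append(b_row)
        linear.append(l_row)
    return bilinear, linear
```

Under these assumptions, the offset with the largest bilinear score at a position hints at where that feature moved in the next frame, which is the kind of motion cue optical flow provides at pixel level. The cost here scales with the window size rather than with a dense per-pixel flow computation.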
