CV, Jun 2, 2014

Continuous Action Recognition Based on Sequence Alignment

arXiv:1406.0288v1, 67 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of recognizing actions in continuous video streams for computer vision applications, presenting an incremental extension of speech recognition techniques to visual data.

The paper tackles continuous action recognition by proposing dynamic frame warping (DFW) and its extensions for simultaneous classification and segmentation, achieving competitive results on datasets like RAVEL, Hollywood-1, and Hollywood-2.

Continuous action recognition is more challenging than isolated recognition because classification and segmentation must be carried out simultaneously. We build on the well-known dynamic time warping (DTW) framework and devise a novel visual alignment technique, namely dynamic frame warping (DFW), which performs isolated recognition by representing videos frame by frame and aligning a test sequence with a model sequence. Moreover, we propose two extensions that perform recognition together with segmentation, namely one-pass DFW and two-pass DFW. These two methods have their roots in continuous speech recognition and, to the best of our knowledge, their extension to continuous visual action recognition has been overlooked. We test and illustrate the proposed techniques on a recently released dataset (RAVEL) and on two public-domain datasets widely used in action recognition (Hollywood-1 and Hollywood-2). We also compare the performance of the proposed isolated and continuous recognition algorithms with that of several recently published methods.
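To make the alignment idea concrete, here is a minimal sketch of classic DTW, the framework the abstract says DFW builds on. It is not the paper's DFW, one-pass DFW, or two-pass DFW. The Euclidean frame distance, the function names (`frame_distance`, `dtw_cost`, `classify_isolated`), and the toy one-dimensional "frames" are all assumptions made for illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two per-frame feature vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_cost(test, model):
    """Cumulative cost of the best classic DTW alignment of two frame sequences."""
    n, m = len(test), len(model)
    inf = float("inf")
    # cost[i][j]: best cumulative cost of aligning test[:i] with model[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(test[i - 1], model[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify_isolated(test, templates):
    """Return the label whose template sequence aligns with `test` at lowest cost."""
    return min(templates, key=lambda label: dtw_cost(test, templates[label]))


# Hypothetical toy data: each frame is a one-dimensional feature vector.
templates = {
    "wave": [[0.0], [1.0], [0.0]],
    "point": [[0.0], [0.0], [2.0]],
}
test_sequence = [[0.0], [0.9], [1.0], [0.1]]
predicted = classify_isolated(test_sequence, templates)
```

This sketch covers only isolated recognition: the test sequence is assumed to contain exactly one action, so the whole sequence is aligned against each template. The continuous setting described in the abstract additionally has to decide where one action ends and the next begins, which this code does not attempt.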

