CVJun 6, 2015

First-Take-All: Temporal Order-Preserving Hashing for 3D Action Videos

arXiv:1506.02184v1
Originality Incremental advance
AI Analysis

This addresses the need for scalable storage and retrieval in large-scale applications like user interfaces, though it is an incremental improvement over existing hashing methods.

The paper tackles the problem of scalable and efficient 3D human action recognition by proposing a First-Take-All (FTA) Hashing algorithm that hashes entire videos into fixed-length codes, achieving over 80% recognition accuracy with about 15 bits per frame.

With the prevalence of the commodity depth cameras, the new paradigm of user interfaces based on 3D motion capturing and recognition have dramatically changed the way of interactions between human and computers. Human action recognition, as one of the key components in these devices, plays an important role to guarantee the quality of user experience. Although the model-driven methods have achieved huge success, they cannot provide a scalable solution for efficiently storing, retrieving and recognizing actions in the large-scale applications. These models are also vulnerable to the temporal translation and warping, as well as the variations in motion scales and execution rates. To address these challenges, we propose to treat the 3D human action recognition as a video-level hashing problem and propose a novel First-Take-All (FTA) Hashing algorithm capable of hashing the entire video into hash codes of fixed length. We demonstrate that this FTA algorithm produces a compact representation of the video invariant to the above mentioned variations, through which action recognition can be solved by an efficient nearest neighbor search by the Hamming distance between the FTA hash codes. Experiments on the public 3D human action datasets shows that the FTA algorithm can reach a recognition accuracy higher than 80%, with about 15 bits per frame considering there are 65 frames per video over the datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes