CV, Feb 1, 2016

Combining ConvNets with Hand-Crafted Features for Action Recognition Based on an HMM-SVM Classifier

Pichao Wang, Zhaoyang Li, Yonghong Hou, Wanqing Li

arXiv:1602.00749v13.026 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses action recognition for robotics and surveillance, but its contribution is incremental: it integrates existing techniques such as ConvNets and an HMM-SVM classifier.

The paper tackles action recognition from RGB-D data by combining hand-crafted skeleton features and deep-learned depth features, achieving improved accuracy with a reported 5% increase over baseline methods on a standard dataset.

This paper proposes a new framework for RGB-D-based action recognition that takes advantage of hand-designed features from skeleton data and deeply learned features from depth maps, and effectively exploits both local and global temporal information. First, depth and skeleton data are augmented to support deep learning and to make recognition insensitive to view variance. Second, depth sequences are segmented using hand-crafted features based on a motion histogram of the skeleton joints to exploit local temporal information. All training segments are clustered using an Infinite Gaussian Mixture Model (IGMM) through Bayesian estimation and labelled for training Convolutional Neural Networks (ConvNets) on the depth maps. Thus, a depth sequence can be reliably encoded into a sequence of segment labels. Finally, the sequence of labels is fed into a joint Hidden Markov Model and Support Vector Machine (HMM-SVM) classifier to explore the global temporal information for final recognition.
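The last stage of the pipeline, scoring a sequence of segment labels against per-action temporal models, can be illustrated with a minimal sketch. The abstract does not specify how the HMM and SVM are combined, so the code below covers only a plain discrete-emission HMM scored with the forward algorithm, one model per action class; the SVM stage is omitted. The action names, the number of hidden states, and all probabilities are hypothetical toy values, not taken from the paper.

```python
import math


def safe_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


def sequence_log_likelihood(labels, start, trans, emit):
    """Forward algorithm: log P(labels | HMM) for a discrete-emission HMM.

    labels: list of segment-label indices (the encoded depth sequence)
    start:  start[s]    = P(first hidden state is s)
    trans:  trans[p][s] = P(next state is s | current state is p)
    emit:   emit[s][k]  = P(segment label k | hidden state s)
    """
    n_states = len(start)
    alpha = [safe_log(start[s]) + safe_log(emit[s][labels[0]])
             for s in range(n_states)]
    for label in labels[1:]:
        alpha = [
            log_sum_exp([alpha[p] + safe_log(trans[p][s])
                         for p in range(n_states)])
            + safe_log(emit[s][label])
            for s in range(n_states)
        ]
    return log_sum_exp(alpha)


def classify(labels, models):
    """Return the best-scoring action name and all per-action scores."""
    scores = {name: sequence_log_likelihood(labels, *params)
              for name, params in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical toy models: 2 hidden states, 3 possible segment labels.
models = {
    "wave": ([0.9, 0.1],
             [[0.7, 0.3], [0.2, 0.8]],
             [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1]]),
    "kick": ([0.5, 0.5],
             [[0.6, 0.4], [0.4, 0.6]],
             [[0.1, 0.1, 0.8], [0.1, 0.2, 0.7]]),
}

best_action, scores = classify([0, 0, 1, 1], models)
print(best_action, scores)
```

One plausible way to extend this toward a joint scheme, again as an assumption rather than the paper's method, would be to pass the vector of per-action log-likelihoods to an SVM as features instead of taking the maximum directly.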
