CV · Aug 7, 2016

Multiview Cauchy Estimator Feature Embedding for Depth and Inertial Sensor-Based Human Action Recognition

arXiv:1608.02183v2 · 57 citations
AI Analysis

This work addresses human action recognition for applications using depth and inertial sensors, presenting an incremental improvement in feature fusion methods.

The paper addresses the problem of fusing multiview features from Kinect and inertial sensors for human action recognition. It proposes an unsupervised method, MCEFE, and demonstrates its effectiveness and robustness on the CAS-YNU-MHAD database.

The ever-growing popularity of Kinect and inertial sensors has prompted intensive research efforts on human action recognition. Since human actions can be characterized by multiple feature representations extracted from Kinect and inertial sensors, multiview features must be encoded into a unified space optimal for human action recognition. In this paper, we propose a new unsupervised feature fusion method termed Multiview Cauchy Estimator Feature Embedding (MCEFE) for human action recognition. By minimizing empirical risk, MCEFE integrates the encoded complementary information in multiple views to find the unified data representation and the projection matrices. To enhance robustness to outliers, the Cauchy estimator is imposed on the reconstruction error. Furthermore, ensemble manifold regularization is enforced on the projection matrices to encode the correlations between different views and avoid overfitting. Experiments are conducted on the new Chinese Academy of Sciences - Yunnan University - Multimodal Human Action Database (CAS-YNU-MHAD) to demonstrate the effectiveness and robustness of MCEFE for human action recognition.
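To illustrate why imposing a Cauchy estimator on the reconstruction error can improve robustness to outliers, the following is a minimal sketch comparing a Cauchy-type loss with the ordinary squared loss. It is not the MCEFE objective itself, which the abstract does not spell out. The form log(1 + (r/c)^2), the scale parameter `c`, and the function names `cauchy_loss` and `squared_loss` are assumptions chosen for illustration.

```python
import numpy as np


def cauchy_loss(residual, c=1.0):
    """Cauchy-type loss log(1 + (r/c)^2); `c` is an assumed scale parameter."""
    r = np.asarray(residual, dtype=float)
    return np.log1p((r / c) ** 2)


def squared_loss(residual):
    """Ordinary squared loss r^2, shown for comparison."""
    r = np.asarray(residual, dtype=float)
    return r ** 2


# Hypothetical reconstruction residuals; the last entry plays the role of an outlier.
residuals = np.array([0.1, 0.5, 1.0, 10.0])

cauchy = cauchy_loss(residuals)
squared = squared_loss(residuals)

# The squared loss grows quadratically with the residual, while the
# Cauchy-type loss grows only logarithmically, so a single large
# residual contributes far less to the total objective.
for r, lc, ls in zip(residuals, cauchy, squared):
    print(f"residual={r:5.1f}  cauchy={lc:8.4f}  squared={ls:8.4f}")
```

For small residuals the two losses nearly coincide, so well-fit samples are treated much as they would be under least squares; only large residuals are down-weighted.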

