ActionXPose: A Novel 2D Multi-view Pose-based Algorithm for Real-time Human Action Recognition
The paper addresses posture-level human action recognition for applications such as surveillance and human-computer interaction, but the contribution appears incremental because it builds on existing pose detection methods.
The paper tackles human action recognition by proposing ActionXPose, a 2D pose-based algorithm that processes poses extracted from RGB videos with an LSTM and a 1D CNN, achieving state-of-the-art performance on datasets such as i3DPost and KTH, with real-time capability and robustness to varied recording conditions.
We present ActionXPose, a novel 2D pose-based algorithm for posture-level Human Action Recognition (HAR). The proposed approach exploits 2D human poses extracted from RGB videos by the OpenPose detector. ActionXPose processes the pose data and feeds it to a Long Short-Term Memory (LSTM) neural network and to a 1D Convolutional Neural Network (CNN), which solve the classification problem. ActionXPose is one of the first algorithms that exploit 2D human poses for HAR. The algorithm runs in real time, is robust to camera movement, subject proximity changes, viewpoint changes and subject appearance changes, and provides a high degree of generalization. In fact, extensive simulations show that ActionXPose can be successfully trained using different datasets at once. State-of-the-art performance on popular datasets for posture-related HAR problems (i3DPost, KTH) is reported, and results are compared with those obtained by other methods, including the selected ActionXPose baseline. Moreover, we also propose two novel datasets, MPOSE and ISLD, recorded in our Intelligent Sensing Lab, to show the generalization performance of ActionXPose.
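The abstract does not specify how the pose data is processed before classification. As a minimal sketch of one plausible preprocessing step, and not the authors' actual method, the Python code below centers each 2D pose on its bounding box and rescales it, which is one common way to reduce sensitivity to subject proximity and camera movement. It then cuts the frame sequence into fixed-length windows of the kind a sequence model such as an LSTM or 1D CNN could consume. The names `normalize_pose` and `sliding_windows`, the zero-filling of missing joints, and the window parameters are all hypothetical choices made for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# One joint is an (x, y) pixel coordinate; None marks a joint the detector missed.
Keypoint = Optional[Tuple[float, float]]


def normalize_pose(pose: Sequence[Keypoint]) -> List[Tuple[float, float]]:
    """Center a 2D pose on its bounding box and scale by the larger box side.

    Assumption for illustration: missing joints are mapped to (0.0, 0.0),
    i.e. the box center after normalization.
    """
    visible = [p for p in pose if p is not None]
    if not visible:
        return [(0.0, 0.0) for _ in pose]
    xs = [x for x, _ in visible]
    ys = [y for _, y in visible]
    cx = (min(xs) + max(xs)) / 2.0
    cy = (min(ys) + max(ys)) / 2.0
    # Guard against a degenerate box (single visible joint).
    scale = max(max(xs) - min(xs), max(ys) - min(ys), 1e-6)
    return [
        ((p[0] - cx) / scale, (p[1] - cy) / scale) if p is not None else (0.0, 0.0)
        for p in pose
    ]


def sliding_windows(frames: Sequence, length: int, stride: int) -> List[list]:
    """Split a frame sequence into overlapping fixed-length windows."""
    return [
        list(frames[i:i + length])
        for i in range(0, len(frames) - length + 1, stride)
    ]


if __name__ == "__main__":
    near = [(120.0, 220.0), (140.0, 260.0), (120.0, 260.0)]
    far = [(10.0, 10.0), (20.0, 30.0), (10.0, 30.0)]
    # The same posture seen at two distances maps to the same normalized pose.
    print(normalize_pose(near))
    print(normalize_pose(far))
    print(sliding_windows(list(range(5)), length=3, stride=1))
```

With this normalization, the same posture observed at different distances or image positions yields the same feature vector, so the downstream classifier only has to model the posture and its temporal evolution.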