CV · Mar 21, 2018

Learning and Recognizing Human Action from Skeleton Movement with Deep Residual Neural Networks

Huy-Hieu Pham, Louahdi Khoudour, Alain Crouzil, Pablo Zegers, Sergio A. Velastin

arXiv:1803.07780v1 · 2.517 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses action recognition for applications like video surveillance and human-computer interfaces, but it is incremental as it applies an existing method (ResNet) to a specific data type.

The paper tackles human action recognition from skeleton data by transforming joint coordinates into images and processing them with Deep Residual Neural Networks (ResNets), achieving state-of-the-art performance on two public datasets.

Automatic human action recognition is indispensable for almost all artificial intelligence systems, such as video surveillance, human-computer interfaces, and video retrieval. Despite considerable progress, recognizing actions in an unknown video remains a challenging task in computer vision. Recently, deep learning algorithms have shown great potential in many vision-related recognition tasks. In this paper, we propose the use of Deep Residual Neural Networks (ResNets) to learn and recognize human actions from skeleton data provided by a Kinect sensor. First, the body joint coordinates are transformed into 3D arrays and saved in RGB image space. Five different deep learning models based on ResNet are designed to extract image features and classify them into action classes. Experiments are conducted on two public video datasets for human action recognition that pose various challenges. The results show that our method achieves state-of-the-art performance compared with existing approaches.
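The skeleton-to-image step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact encoding, so everything below is an assumption made for illustration: the function name `skeleton_to_rgb`, the layout (rows as joints, columns as frames), and the min-max scaling of (x, y, z) to (R, G, B) over the whole sequence are hypothetical choices, not the authors' confirmed method.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) of one body joint
Frame = List[Joint]                  # all joints at one time step
Pixel = Tuple[int, int, int]         # (R, G, B), each in 0..255


def skeleton_to_rgb(sequence: List[Frame]) -> List[List[Pixel]]:
    """Encode a skeleton sequence as a grid of RGB pixels.

    Assumed (hypothetical) layout: rows = joints, columns = frames,
    with (x, y, z) mapped to (R, G, B) by min-max scaling over the
    whole sequence.
    """
    if not sequence or not sequence[0]:
        raise ValueError("sequence must contain at least one frame with joints")
    num_joints = len(sequence[0])
    if any(len(frame) != num_joints for frame in sequence):
        raise ValueError("every frame must have the same number of joints")

    # Global minimum and maximum over every coordinate in the sequence.
    values = [c for frame in sequence for joint in frame for c in joint]
    lo, hi = min(values), max(values)
    span = hi - lo

    def scale(c: float) -> int:
        # A constant sequence has no range; map every value to 0.
        if span == 0:
            return 0
        return round(255 * (c - lo) / span)

    # image[j][t] is the pixel for joint j at frame t.
    return [
        [tuple(scale(c) for c in sequence[t][j]) for t in range(len(sequence))]
        for j in range(num_joints)
    ]
```

Under these assumptions, each action sequence becomes one small color image whose horizontal axis is time, which is what lets an image classifier such as a ResNet be applied to skeleton data.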
