CV | Mar 21, 2018

Learning and Recognizing Human Action from Skeleton Movement with Deep Residual Neural Networks

arXiv:1803.07780v1 | 17 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses action recognition for applications like video surveillance and human-computer interfaces, but it is incremental as it applies an existing method (ResNet) to a specific data type.

The paper tackled human action recognition from skeleton data by using Deep Residual Neural Networks (ResNets) to process transformed joint coordinates as images, achieving state-of-the-art performance on two public datasets.

Automatic human action recognition is indispensable for almost all artificial intelligence systems, such as video surveillance, human-computer interfaces, and video retrieval. Despite considerable progress, recognizing actions in an unknown video is still a challenging task in computer vision. Recently, deep learning algorithms have proved their great potential in many vision-related recognition tasks. In this paper, we propose the use of Deep Residual Neural Networks (ResNets) to learn and recognize human actions from skeleton data provided by the Kinect sensor. First, the body joint coordinates are transformed into 3D arrays and saved in RGB image space. Five different deep learning models based on ResNet have been designed to extract image features and classify them into action classes. Experiments are conducted on two public video datasets for human action recognition containing various challenges. The results show that our method achieves state-of-the-art performance compared with existing approaches.
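
The abstract does not spell out how joint coordinates become an RGB image, so the following is only a minimal sketch of one plausible encoding, not the authors' implementation. It assumes a sequence is given as frames of joints of (x, y, z) coordinates, that rows of the image index joints and columns index frames, and that coordinates are min-max scaled over the whole sequence into 0..255. The function name `skeleton_to_rgb` and all of these layout and normalization choices are hypothetical.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]


def skeleton_to_rgb(sequence: Sequence[Sequence[Sequence[float]]]) -> List[List[Pixel]]:
    """Encode a skeleton sequence as an RGB pixel grid.

    Input layout (assumed): sequence[frame][joint] = (x, y, z).
    Output layout (assumed): image[joint][frame] = (R, G, B),
    with x -> R, y -> G, z -> B.
    """
    if not sequence or not sequence[0]:
        raise ValueError("sequence must contain at least one frame and one joint")

    n_joints = len(sequence[0])
    for frame in sequence:
        if len(frame) != n_joints:
            raise ValueError("every frame must have the same number of joints")
        for joint in frame:
            if len(joint) != 3:
                raise ValueError("every joint must have exactly 3 coordinates")

    # Min-max range over the whole sequence (an assumption, not from the paper).
    values = [c for frame in sequence for joint in frame for c in joint]
    lo, hi = min(values), max(values)
    span = hi - lo

    def to_byte(value: float) -> int:
        # A constant sequence carries no contrast; map it to 0.
        if span == 0:
            return 0
        return int(round((value - lo) / span * 255))

    n_frames = len(sequence)
    return [
        [
            (
                to_byte(sequence[t][j][0]),
                to_byte(sequence[t][j][1]),
                to_byte(sequence[t][j][2]),
            )
            for t in range(n_frames)
        ]
        for j in range(n_joints)
    ]
```

The resulting grid can be written out with any image library and fed to an image classifier such as a ResNet. Pure Python is used here only to keep the sketch dependency-free; an array library would be the usual choice for real data.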

