CVLGMar 22, 2017

Two-Stream RNN/CNN for Action Recognition in 3D Videos

arXiv:1703.09783v2116 citations
AI Analysis

This work addresses action recognition for applications like health monitoring and surveillance, representing an incremental improvement through a hybrid method.

The paper tackled action recognition in 3D videos by combining recurrent neural networks (RNNs) and convolutional neural networks (CNNs) in a voting approach, resulting in a 14% improvement in recognition rates over state-of-the-art methods on standard datasets.

The recognition of actions from video sequences has many applications in health monitoring, assisted living, surveillance, and smart homes. Despite advances in sensing, in particular related to 3D video, the methodologies to process the data are still subject to research. We demonstrate superior results by a system which combines recurrent neural networks with convolutional neural networks in a voting approach. The gated-recurrent-unit-based neural networks are particularly well-suited to distinguish actions based on long-term information from optical tracking data; the 3D-CNNs focus more on detailed, recent information from video data. The resulting features are merged in an SVM which then classifies the movement. In this architecture, our method improves recognition rates of state-of-the-art methods by 14% on standard data sets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes