CV, Feb 1, 2018

Deep-Temporal LSTM for Daily Living Action Recognition

arXiv:1802.00421v2, 12 citations
AI Analysis

This work addresses action recognition for daily living scenarios, presenting an incremental improvement by combining spatial and temporal modeling.

The paper tackles daily living action recognition with a deep-temporal LSTM architecture that fuses 3D skeleton geometry with deep static appearance, and reports competitive performance on the CAD60, MSRDailyActivity3D, and NTU-RGB+D datasets.

In this paper, we propose to improve the traditional use of RNNs by employing a many-to-many model for video classification. We analyze the importance of modeling spatial layout and temporal encoding for daily living action recognition. Many RGB methods focus only on short-term temporal information obtained from optical flow. Skeleton-based methods, on the other hand, show that modeling long-term skeleton evolution improves action recognition accuracy. In this work, we propose a deep-temporal LSTM architecture which extends the standard LSTM and allows better encoding of temporal information. In addition, we propose to fuse 3D skeleton geometry with deep static appearance. We validate our approach on the publicly available CAD60, MSRDailyActivity3D and NTU-RGB+D datasets, achieving competitive performance compared to the state of the art.
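
To make the many-to-many idea concrete, here is a minimal sketch that contrasts it with the more common many-to-one setup. It is not the authors' implementation: the abstract does not say how per-step outputs are combined, so the loss averaging and the probability averaging below are assumptions, and the function names and the `logits` values are hypothetical. The per-step logits stand in for whatever a recurrent network would emit at each frame.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def many_to_one_loss(step_logits, label):
    """Cross-entropy computed on the final time step only."""
    return -math.log(softmax(step_logits[-1])[label])

def many_to_many_loss(step_logits, label):
    """Mean cross-entropy over all time steps (assumed aggregation);
    every step is supervised with the same video-level label."""
    losses = [-math.log(softmax(l)[label]) for l in step_logits]
    return sum(losses) / len(losses)

def predict_video(step_logits):
    """Average per-step class probabilities (assumed), then take the argmax."""
    n_classes = len(step_logits[0])
    probs = [softmax(l) for l in step_logits]
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])

# Hypothetical per-frame logits for a 3-frame clip and 2 classes.
logits = [[2.0, 0.0], [1.0, 0.5], [0.0, 0.2]]
video_class = predict_video(logits)
```

In this toy clip the last frame alone favours class 1, while the averaged per-frame probabilities favour class 0, which illustrates why using every time step can change the video-level decision.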
