CV · Feb 27, 2018

Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition

arXiv:1802.09834v1 · 185 citations
Originality: Incremental advance
AI Analysis

This work addresses action recognition from skeleton data, a domain-specific problem, with incremental improvements over existing methods.

The paper tackles skeleton-based action recognition by proposing a spatio-temporal graph convolution (STGC) approach that combines local convolutional filtering and sequence learning, achieving improved state-of-the-art results on benchmark datasets including NTU RGB+D.

Variations of human body skeletons may be considered as dynamic graphs, a generic data representation for numerous real-world applications. In this paper, we propose a spatio-temporal graph convolution (STGC) approach that combines the strengths of local convolutional filtering with the sequence-learning ability of autoregressive moving average models. To encode dynamic graphs, the constructed multi-scale local graph convolution filters, consisting of matrices of local receptive fields and signal mappings, are applied recursively to structured graph data in the temporal and spatial domains. The proposed model is generic and principled, as it can be generalized to other dynamic models. We theoretically prove the stability of STGC and provide an upper bound on the signal transformation to be learnt. Further, the proposed recursive model can be stacked into a multi-layer architecture. To evaluate our model, we conduct extensive experiments on four benchmark skeleton-based action datasets, including the large-scale, challenging NTU RGB+D. The experimental results demonstrate the effectiveness of our proposed model and its improvement over the state of the art.
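The abstract does not state the filter equations, so the following is only a generic sketch of the idea it describes: a local graph filter applied recursively over time to a skeleton sequence. It is not the STGC model itself. The single-scale normalized adjacency, the `tanh` nonlinearity, the toy five-joint skeleton, and all names (`normalized_adjacency`, `recursive_graph_filter`, `W_in`, `W_rec`) are assumptions made for illustration.

```python
import numpy as np


def normalized_adjacency(edges, num_joints):
    """Symmetric normalized adjacency with self-loops (an assumed choice)."""
    A = np.zeros((num_joints, num_joints))
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0
    A += np.eye(num_joints)              # self-loops keep each joint's own signal
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def recursive_graph_filter(X, A_hat, W_in, W_rec):
    """Apply a one-hop graph filter recursively along the time axis.

    X has shape (T, N, C_in): T frames, N joints, C_in channels per joint.
    Hypothetical update: h_t = tanh(A_hat x_t W_in + A_hat h_{t-1} W_rec).
    """
    T, N, _ = X.shape
    C_out = W_in.shape[1]
    h = np.zeros((N, C_out))
    H = np.zeros((T, N, C_out))
    for t in range(T):
        # Spatial step: mix features of neighbouring joints.
        # Temporal step: feed the previous hidden state back in.
        h = np.tanh(A_hat @ X[t] @ W_in + A_hat @ h @ W_rec)
        H[t] = h
    return H


# Toy skeleton: a chain of 5 joints observed over 4 frames with 3D coordinates.
rng = np.random.default_rng(0)
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
A_hat = normalized_adjacency(edges, num_joints=5)
X = rng.normal(size=(4, 5, 3))
W_in = rng.normal(scale=0.1, size=(3, 8))
W_rec = rng.normal(scale=0.1, size=(8, 8))
H = recursive_graph_filter(X, A_hat, W_in, W_rec)
print(H.shape)
```

In this sketch the hidden state stays bounded because of the `tanh` squashing and because the normalized adjacency has eigenvalues of magnitude at most 1. This is a property of the toy filter only and does not reproduce the stability proof mentioned in the abstract.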

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
