CV · May 4, 2021

Fusing Higher-order Features in Graph Neural Networks for Skeleton-based Action Recognition

arXiv:2105.01563v5 · 98 citations · Has Code
Originality: Incremental advance
AI Analysis

This work improves skeleton-based action recognition for edge devices by reducing confusion between actions with similar motion trajectories. It is an incremental advance because it builds on existing spatial-temporal graph neural networks.

The paper tackles the problem of action recognition from skeleton sequences by fusing higher-order angular features into graph neural networks, achieving new state-of-the-art accuracy on NTU60 and NTU120 benchmarks with fewer parameters and reduced run time.

Skeleton sequences are lightweight and compact, and thus are ideal candidates for action recognition on edge devices. Recent skeleton-based action recognition methods extract features from 3D joint coordinates as spatial-temporal cues, using these representations in a graph neural network for feature fusion to boost recognition performance. The use of first- and second-order features, i.e., joint and bone representations, has led to high accuracy. Nonetheless, many models are still confused by actions that have similar motion trajectories. To address this issue, we propose fusing higher-order features in the form of angular encoding into modern architectures to robustly capture the relationships between joints and body parts. This simple fusion with popular spatial-temporal graph neural networks achieves new state-of-the-art accuracy on two large benchmarks, NTU60 and NTU120, while employing fewer parameters and reduced run time. Our source code is publicly available at: https://github.com/ZhenyueQin/Angular-Skeleton-Encoding.
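To make the feature orders in the abstract concrete, here is a minimal NumPy sketch of joint (first-order), bone (second-order), and angle-based (higher-order) features. It is an illustration under assumptions, not the authors' implementation: the 5-joint `PARENTS` layout, the function names, and the choice of the cosine as the angular quantity are all hypothetical, and the paper's exact encoding and its fusion into the graph neural network may differ.

```python
import numpy as np

# Hypothetical 5-joint toy skeleton (NOT the NTU layout):
# PARENTS[i] is the index of joint i's parent; the root is its own parent.
PARENTS = np.array([0, 0, 1, 2, 1])


def bone_features(joints):
    """Second-order features: vector from each joint's parent to the joint.

    joints has shape (T, V, 3) for T frames and V joints.
    The root's bone is the zero vector because it is its own parent.
    """
    return joints - joints[:, PARENTS, :]


def angular_features(joints, triplets, eps=1e-8):
    """Cosine of the angle at joint c between the rays c->a and c->b.

    triplets is a list of (a, c, b) joint indices (an assumed convention).
    Returns an array of shape (T, len(triplets)).
    """
    a, c, b = np.asarray(triplets).T
    u = joints[:, a, :] - joints[:, c, :]
    v = joints[:, b, :] - joints[:, c, :]
    dot = np.sum(u * v, axis=-1)
    norm = np.linalg.norm(u, axis=-1) * np.linalg.norm(v, axis=-1)
    # Degenerate (zero-length) rays yield 0 instead of NaN.
    return np.where(norm > eps, dot / np.maximum(norm, eps), 0.0)


# One frame of made-up 3D coordinates for the toy skeleton.
joints = np.zeros((1, 5, 3))
joints[0, 1] = [0.0, 1.0, 0.0]
joints[0, 2] = [1.0, 1.0, 0.0]
joints[0, 3] = [2.0, 1.0, 0.0]
joints[0, 4] = [0.0, 2.0, 0.0]

bones = bone_features(joints)
angles = angular_features(joints, [(0, 1, 2), (0, 1, 4)])
```

The angle depends on three joints at once and is unchanged by translating or rotating the whole skeleton, so it can carry information that raw coordinates and bone vectors do not expose directly. How such values are attached to joints as extra input channels is left out of this sketch.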

Code Implementations: 1 repo