CV · May 31, 2022

Skeleton-based Action Recognition via Temporal-Channel Aggregation

arXiv:2205.15936v2 · 32 citations · h-index: 25
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in skeleton-based action recognition for applications like human-computer interaction, though it appears incremental in improving feature aggregation.

The paper tackles the problem of effectively combining spatial and temporal features in skeleton-based action recognition by proposing a Temporal-Channel Aggregation Graph Convolutional Network (TCA-GCN), which outperforms state-of-the-art methods on multiple datasets.

Skeleton-based action recognition methods are limited by the semantic extraction of spatio-temporal skeletal maps. Current methods struggle to combine features from the temporal and spatial graph dimensions effectively and tend to emphasize one at the expense of the other. In this paper, we propose a Temporal-Channel Aggregation Graph Convolutional Network (TCA-GCN) that learns spatial and temporal topologies dynamically and efficiently aggregates topological features across temporal and channel dimensions for skeleton-based action recognition. We use the Temporal Aggregation module to learn temporal-dimension features and the Channel Aggregation module to efficiently combine spatial dynamic channel-wise topological features with temporal dynamic topological features. In addition, we extract multi-scale skeletal features in temporal modeling and fuse them with an attention mechanism. Extensive experiments show that our model outperforms state-of-the-art methods on the NTU RGB+D, NTU RGB+D 120, and NW-UCLA datasets.
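
To make the data flow in the abstract concrete, here is a minimal sketch of the general pattern it describes: aggregate joint features over the skeleton graph, build several temporal scales, and fuse them with attention weights. It is not the TCA-GCN implementation. It assumes a fixed, row-normalized adjacency in place of the learned dynamic topologies, window averaging in place of temporal convolutions, and hand-set attention scores in place of learned ones. All function names (`normalize_adjacency`, `spatial_aggregate`, `temporal_pool`, `fuse_branches`) are hypothetical.

```python
import math

def normalize_adjacency(adj):
    """Add self-loops and row-normalize a V x V adjacency matrix (assumed fixed)."""
    n = len(adj)
    loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [[v / sum(row) for v in row] for row in loops]

def spatial_aggregate(frame, adj_norm):
    """Mix each joint's C features with its neighbours': out[i] = sum_j A[i][j] * frame[j]."""
    n, c = len(frame), len(frame[0])
    return [[sum(adj_norm[i][j] * frame[j][k] for j in range(n)) for k in range(c)]
            for i in range(n)]

def temporal_pool(seq, window):
    """Average a T x V x C sequence over a centred window of frames (one temporal scale)."""
    t_len, n, c = len(seq), len(seq[0]), len(seq[0][0])
    half = window // 2
    out = []
    for t in range(t_len):
        span = seq[max(0, t - half):min(t_len, t + half + 1)]
        out.append([[sum(f[v][k] for f in span) / len(span) for k in range(c)]
                    for v in range(n)])
    return out

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_branches(branches, scores):
    """Attention-style fusion: softmax-weighted sum of same-shaped T x V x C branches."""
    weights = softmax(scores)
    t_len, n, c = len(branches[0]), len(branches[0][0]), len(branches[0][0][0])
    return [[[sum(w * b[t][v][k] for w, b in zip(weights, branches)) for k in range(c)]
             for v in range(n)] for t in range(t_len)]

# Toy input: a 3-joint chain (0-1-2), T=4 frames, C=1 feature per joint.
adj_norm = normalize_adjacency([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
seq = [[[float(t + v)] for v in range(3)] for t in range(4)]
spatial = [spatial_aggregate(frame, adj_norm) for frame in seq]
branches = [temporal_pool(spatial, w) for w in (1, 3)]  # two temporal scales
fused = fuse_branches(branches, scores=[0.0, 0.0])      # equal scores: plain average
```

In a trained model the adjacency, the temporal filters, and the fusion scores would all be learned from data; the sketch only fixes them so that the shape of the computation (T frames by V joints by C channels in, the same shape out) is easy to follow.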

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
