CV LG Dec 6, 2018

Tri-axial Self-Attention for Concurrent Activity Recognition

Yanyi Zhang, Xinyu Li, Kaixiang Huang, Yehan Wang, Shuhong Chen, Ivan Marsic

arXiv:1812.02817v1 0.9

Originality: Incremental advance

AI Analysis

This work addresses the problem of recognizing multiple overlapping activities in video data. The contribution is incremental, as it builds on existing attention and transformer methods.

The paper tackles concurrent activity recognition by proposing a tri-axial self-attention system that extracts and models features for individual activities, achieving state-of-the-art or comparable performance on three standard datasets.

We present a system for concurrent activity recognition. To extract features associated with different activities, we propose a feature-to-activity attention that maps the extracted global features to sub-features associated with individual activities. To model the temporal associations of individual activities, we propose a transformer-network encoder that models independent temporal associations for each activity. To make the concurrent activity prediction aware of the potential associations between activities, we propose self-attention with an association mask. Our system achieved state-of-the-art or comparable performance on three commonly used concurrent activity detection datasets. Our visualizations demonstrate that our system is able to locate the important spatial-temporal features for final decision making. We also showed that our system can be applied to general multilabel classification problems.
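The association-masked self-attention mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the mask is built or how the attention is parameterized, so the function name `masked_self_attention`, the hand-written boolean mask, and the omission of learned query/key/value projections are all assumptions made for illustration only.

```python
import numpy as np

def masked_self_attention(x, mask):
    """Scaled dot-product self-attention across activities (illustrative only).

    x:    (num_activities, dim) array, one feature vector per activity.
    mask: (num_activities, num_activities) boolean array; True where
          activity i is allowed to attend to activity j (hypothetical mask).
    Simplification: queries, keys and values are all x (no learned projections).
    """
    n, d = x.shape
    # Always let an activity attend to itself so no row is fully masked.
    mask = mask | np.eye(n, dtype=bool)
    scores = x @ x.T / np.sqrt(d)
    # Masked-out pairs get -inf so they receive zero weight after softmax.
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ x, weights

# Hypothetical example: activities 0 and 1 are associated, activity 2 is not.
x = np.random.default_rng(0).normal(size=(3, 4))
mask = np.array([[1, 1, 0],
                 [1, 1, 0],
                 [0, 0, 1]], dtype=bool)
out, w = masked_self_attention(x, mask)
```

In this toy setup, activity 2 attends only to itself, so its output equals its input, while activities 0 and 1 mix information with each other. A real system would presumably learn the projections and derive the mask from data or prior knowledge about which activities co-occur.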
