CV · Jan 6, 2024

Group Activity Recognition using Unreliable Tracked Pose

arXiv:2401.03262v1 · 6 citations · h-index: 11 · Neural computing & applications (Print)
Originality: Incremental advance
AI Analysis

This work addresses a practical limitation in group activity recognition for video analysis, offering an incremental improvement by making models more robust to real-world tracking errors.

The paper addresses the reliance of video group activity recognition on high-quality tracking. It introduces RePGARS, a deep learning approach that tolerates unreliable tracking and pose information and outperforms the tested existing methods that do not use ground truth tracking.

Group activity recognition in video is a complex task due to the need for a model to recognise the actions of all individuals in the video and their complex interactions. Recent studies propose that optimal performance is achieved by individually tracking each person and subsequently inputting the sequence of poses or cropped images/optical flow into a model. This helps the model to recognise what action each person is performing before the per-person actions are merged to arrive at the group action class. However, all previous models are highly reliant on high-quality tracking and have only been evaluated using ground truth tracking information. In practice, it is almost impossible to achieve highly reliable tracking information for all individuals in a group activity video. We introduce an innovative deep learning-based group activity recognition approach called Rendered Pose based Group Activity Recognition System (RePGARS), which is designed to be tolerant of unreliable tracking and pose information. Experimental results confirm that RePGARS outperforms all existing group activity recognition algorithms tested which do not use ground truth detection and tracking information.

