CV · Aug 19, 2024

Long-Tail Temporal Action Segmentation with Group-wise Temporal Logit Adjustment

arXiv:2408.09919v1 · 6 citations · h-index: 14
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of recognizing infrequent actions in temporal action segmentation, with applications such as video analysis. It appears incremental because it builds on existing long-tail methods.

The paper tackles the problem of long-tailed action distribution in procedural activity videos, where existing methods fail to recognize tail actions, and proposes a G-TLA framework that significantly improves tail action segmentation without performance loss on head actions.

Procedural activity videos often exhibit a long-tailed action distribution due to varying action frequencies and durations. However, state-of-the-art temporal action segmentation methods overlook the long tail and fail to recognize tail actions. Existing long-tail methods make class-independent assumptions and struggle to identify tail classes when applied to temporal segmentation frameworks. This work proposes a novel group-wise temporal logit adjustment (G-TLA) framework that combines a group-wise softmax formulation with activity information and action ordering for logit adjustment. The proposed framework significantly improves the segmentation of tail actions without any performance loss on head actions.
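The two ingredients named in the abstract, logit adjustment and a group-wise softmax, can be illustrated with a small sketch. This is not the paper's G-TLA implementation: it shows only generic prior-based post-hoc logit adjustment and a softmax normalized within class groups, and it does not model activity information or action ordering. The function names, the head/tail grouping, the smoothing scheme, and the toy numbers are all assumptions made for illustration.

```python
import math
from collections import Counter


def class_priors(frame_labels, num_classes):
    """Estimate per-class frame priors with add-one smoothing (an assumption)."""
    counts = Counter(frame_labels)
    total = len(frame_labels)
    return [(counts.get(c, 0) + 1) / (total + num_classes)
            for c in range(num_classes)]


def adjust_logits(logits, priors, tau=1.0):
    """Generic post-hoc logit adjustment: subtract tau * log(prior) per class.

    Rare classes have very negative log-priors, so their logits are boosted.
    """
    return [z - tau * math.log(p) for z, p in zip(logits, priors)]


def groupwise_softmax(logits, groups):
    """Softmax normalized separately inside each group of class indices."""
    probs = [0.0] * len(logits)
    for group in groups:
        peak = max(logits[c] for c in group)  # subtract max for stability
        exps = {c: math.exp(logits[c] - peak) for c in group}
        norm = sum(exps.values())
        for c in group:
            probs[c] = exps[c] / norm
    return probs


# Toy long-tailed frame labels: class 0 is the head, classes 1 and 2 the tail.
frame_labels = [0] * 90 + [1] * 8 + [2] * 2
priors = class_priors(frame_labels, num_classes=3)

raw_logits = [2.0, 1.8, 1.7]  # hypothetical per-frame classifier scores
adjusted = adjust_logits(raw_logits, priors)

pred_raw = max(range(3), key=lambda c: raw_logits[c])
pred_adjusted = max(range(3), key=lambda c: adjusted[c])

# Hypothetical grouping: head classes in one group, tail classes in another.
group_probs = groupwise_softmax(raw_logits, groups=[[0], [1, 2]])
```

In this toy case the raw scores favor the head class, while the prior-adjusted scores favor the rarest class. Normalizing within groups means tail classes compete only with each other rather than with the dominant head class.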

Code Implementations: 1 repo
