CV, Dec 5, 2017

Learning Latent Super-Events to Detect Multiple Activities in Videos

arXiv:1712.01938v2, 96 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of detecting multiple activities in complex, real-world videos such as surveillance footage. It represents an incremental improvement in video activity detection.

The paper tackles the problem of detecting multiple activities in continuous, unsegmented videos by introducing latent super-events, which capture temporal relationships among events. It reports that this approach significantly improves activity detection and advances state-of-the-art results on multiple public datasets.

In this paper, we introduce the concept of learning latent super-events from activity videos, and present how it benefits activity detection in continuous videos. We define a super-event as a set of multiple events occurring together in videos with a particular temporal organization; it is the opposite concept of sub-events. Real-world videos contain multiple activities and are rarely segmented (e.g., surveillance videos), and learning latent super-events allows the model to capture how the events are temporally related in videos. We design temporal structure filters that enable the model to focus on particular sub-intervals of the videos, and use them together with a soft attention mechanism to learn representations of latent super-events. Super-event representations are combined with per-frame or per-segment CNNs to provide frame-level annotations. Our approach is designed to be fully differentiable, enabling end-to-end learning of latent super-event representations jointly with the activity detector that uses them. Our experiments on multiple public video datasets confirm that the proposed concept of latent super-event learning significantly benefits activity detection, advancing the state of the art.
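
The pipeline described in the abstract (temporal structure filters, soft attention, and combination with per-frame features) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not stated in the abstract: the Cauchy-shaped filter form, the function names, the use of concatenation to combine the super-event vector with each frame feature, and the linear-plus-sigmoid classifier. The sketch shows only a forward pass; in the paper's setting the filter centers, widths, attention logits, and classifier weights would be learned end to end.

```python
import numpy as np

def temporal_structure_filters(T, centers, widths):
    """Return an (N, T) array of temporal filters, each summing to 1.

    ASSUMPTION: filters are Cauchy-shaped bumps. `centers` are fractions of
    the video length in [0, 1]; `widths` are positive and measured in frames.
    """
    t = np.arange(T, dtype=np.float64)[None, :]                      # (1, T)
    x = (np.asarray(centers, dtype=np.float64) * (T - 1))[:, None]   # (N, 1)
    g = np.asarray(widths, dtype=np.float64)[:, None]                # (N, 1)
    f = 1.0 / (np.pi * g * (1.0 + ((t - x) / g) ** 2))               # Cauchy density
    return f / f.sum(axis=1, keepdims=True)

def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()

def super_event_representation(frame_feats, centers, widths, attn_logits):
    """Pool (T, D) frame features into one (D,) super-event vector.

    Each filter pools one sub-interval of the video; soft attention then
    mixes the N pooled vectors into a single representation.
    """
    T = frame_feats.shape[0]
    filters = temporal_structure_filters(T, centers, widths)         # (N, T)
    pooled = filters @ frame_feats                                   # (N, D)
    weights = softmax(np.asarray(attn_logits, dtype=np.float64))     # (N,)
    return weights @ pooled                                          # (D,)

def frame_level_scores(frame_feats, super_event, W, b):
    """ASSUMPTION: concatenate the super-event vector to every frame feature,
    then apply a linear layer (W has shape (2D, C)) and a sigmoid, giving
    independent per-frame scores for C activities."""
    T = frame_feats.shape[0]
    joint = np.concatenate([frame_feats, np.tile(super_event, (T, 1))], axis=1)
    return 1.0 / (1.0 + np.exp(-(joint @ W + b)))                    # (T, C)

# Toy usage with random stand-ins for per-frame CNN features.
rng = np.random.default_rng(0)
feats = rng.normal(size=(50, 8))                  # 50 frames, 8-dim features
s = super_event_representation(
    feats, centers=[0.2, 0.5, 0.8], widths=[3.0, 5.0, 3.0],
    attn_logits=[0.0, 1.0, 0.0])
scores = frame_level_scores(feats, s, rng.normal(size=(16, 4)), np.zeros(4))
print(scores.shape)  # (50, 4)
```

Because every operation above is a smooth function of the centers, widths, and attention logits, the same computation written in an automatic-differentiation framework could be trained jointly with the classifier, which matches the "fully differentiable" design the abstract describes.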

Code Implementations: 2 repos
