CV
Apr 3, 2018

When will you do what? - Anticipating Temporal Occurrences of Activities

arXiv:1804.00892v1
216 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for long-term activity anticipation in practical video-analysis applications. The advance is incremental because it builds on existing short-term prediction methods.

The paper tackles long-term prediction of future actions and their durations in videos, beyond just a few seconds, and shows that its CNN-based and RNN-based methods generate accurate predictions even for long videos with many different actions and noisy input.

Analyzing human actions in videos has gained increased attention recently. While most works focus on classifying and labeling observed video frames or anticipating the very recent future, making long-term predictions over more than just a few seconds is a task with many practical applications that has not yet been addressed. In this paper, we propose two methods to predict a considerably large number of future actions and their durations. Both a CNN and an RNN are trained to learn future video labels based on previously seen content. We show that our methods generate accurate predictions of the future even for long videos with a huge number of different actions and can even deal with noisy or erroneous input information.

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
