CV, LG, IV · Nov 25, 2019

Oops! Predicting Unintentional Action in Video

arXiv:1911.11206v1 · 119 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of understanding human intentionality in video. The advance is incremental: it builds on existing action recognition methods, contributing a new dataset and a set of tasks.

The researchers tackle the problem of predicting unintentional action in video by introducing a dataset and three tasks: recognition, localization, and anticipation. A self-supervised approach based on video speed performs competitively with highly supervised pretraining, but a significant gap between machine and human performance remains.

From just a short glance at a video, we can often tell whether a person's action is intentional or not. Can we train a model to recognize this? We introduce a dataset of in-the-wild videos of unintentional action, as well as a suite of tasks for recognizing, localizing, and anticipating its onset. We train a supervised neural network as a baseline and analyze its performance compared to human consistency on the tasks. We also investigate self-supervised representations that leverage natural signals in our dataset, and show the effectiveness of an approach that uses the intrinsic speed of video to perform competitively with highly-supervised pretraining. However, a significant gap between machine and human performance remains. The project website is available at https://oops.cs.columbia.edu
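
The abstract does not spell out how video speed is turned into a training signal. One common formulation is to sample clips at different frame strides and train a classifier to predict which stride was used. The sketch below assumes that formulation. The names `SPEED_STRIDES` and `sample_speed_clip` and the stride values are hypothetical, not taken from the paper or its code.

```python
import random

# Hypothetical speed classes: the frame stride used when sampling a clip.
# These values are an assumption for illustration, not the paper's settings.
SPEED_STRIDES = [1, 2, 4, 8]


def sample_speed_clip(num_frames, clip_len, rng):
    """Pick a random speed class and return (frame_indices, label).

    frame_indices are positions in the source video; label is the index
    into SPEED_STRIDES that a classifier would be trained to predict.
    """
    # Keep only the strides for which a full clip fits inside the video.
    valid = [i for i, s in enumerate(SPEED_STRIDES)
             if (clip_len - 1) * s < num_frames]
    if not valid:
        raise ValueError("video too short for any speed class")
    label = rng.choice(valid)
    stride = SPEED_STRIDES[label]
    span = (clip_len - 1) * stride            # distance from first to last frame
    start = rng.randrange(num_frames - span)  # guarantees start + span < num_frames
    indices = [start + k * stride for k in range(clip_len)]
    return indices, label


if __name__ == "__main__":
    rng = random.Random(0)
    indices, label = sample_speed_clip(num_frames=100, clip_len=16, rng=rng)
    print("stride:", SPEED_STRIDES[label], "frames:", indices)
```

The label comes for free from the sampling procedure, so no human annotation is needed; a video model trained on such pairs could then be fine-tuned on the labelled recognition, localization, and anticipation tasks.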

Code Implementations: 1 repo
