CV LG IV
Nov 25, 2019

Oops! Predicting Unintentional Action in Video

Dave Epstein, Boyuan Chen, Carl Vondrick

arXiv:1911.11206v123.4119 citations

Originality: Incremental advance

AI Analysis

This work addresses the challenge of understanding human intentionality in video. The advance is incremental: it builds on existing action recognition methods and contributes a new dataset and tasks.

The researchers tackled the problem of predicting unintentional actions in videos by introducing a dataset and a set of tasks for recognition, localization, and anticipation. They found that a self-supervised approach based on video speed performed competitively with supervised methods, but that a significant gap between machine and human performance remained.

From just a short glance at a video, we can often tell whether a person's action is intentional or not. Can we train a model to recognize this? We introduce a dataset of in-the-wild videos of unintentional action, as well as a suite of tasks for recognizing, localizing, and anticipating its onset. We train a supervised neural network as a baseline and analyze its performance compared to human consistency on the tasks. We also investigate self-supervised representations that leverage natural signals in our dataset, and show the effectiveness of an approach that uses the intrinsic speed of video to perform competitively with highly-supervised pretraining. However, a significant gap between machine and human performance remains. The project website is available at https://oops.cs.columbia.edu
