CV · Jul 11, 2016

Efficient Activity Detection in Untrimmed Video with Max-Subgraph Search

arXiv:1607.02815v1 · 17 citations
Originality: Incremental advance
AI Analysis

This paper addresses the problem of efficiently localizing activities in video, a computer vision task, and represents an incremental improvement in search strategies.

The paper tackles activity detection in untrimmed videos by framing it as a maximum-weight connected subgraph problem, yielding a fast method that is often more accurate than existing search strategies across four datasets.

We propose an efficient approach for activity detection in video that unifies activity categorization with space-time localization. The main idea is to pose activity detection as a maximum-weight connected subgraph problem. Offline, we learn a binary classifier for an activity category using positive video exemplars that are "trimmed" in time to the activity of interest. Then, given a novel untrimmed video sequence, we decompose it into a 3D array of space-time nodes, which are weighted based on the extent to which their component features support the learned activity model. To perform detection, we then directly localize instances of the activity by solving for the maximum-weight connected subgraph in the test video's space-time graph. We show that this detection strategy permits an efficient branch-and-cut solution for the best-scoring (and possibly non-cubically shaped) portion of the video for a given activity classifier. The upshot is a fast method that can search a broader space of space-time region candidates than was previously practical, which we find often leads to more accurate detection. We demonstrate the proposed algorithm on four datasets, and we show its speed and accuracy advantages over multiple existing search strategies.
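
To make the objective concrete, here is a minimal Python sketch of the maximum-weight connected subgraph problem on a toy space-time grid. It is not the branch-and-cut solver described in the abstract: it enumerates every connected node set by brute force, which is only feasible for a handful of nodes. The function names, the 6-connected neighbourhood, and the node weights are all assumptions made for illustration. In the paper's setting the weights would come from the learned activity classifier.

```python
from itertools import combinations, product


def grid_neighbors(node, shape):
    """Yield the 6-connected neighbours of node = (t, y, x) inside the grid."""
    for axis in range(3):
        for step in (-1, 1):
            moved = list(node)
            moved[axis] += step
            if 0 <= moved[axis] < shape[axis]:
                yield tuple(moved)


def is_connected(nodes, shape):
    """Return True if the node set forms a single connected component."""
    nodes = set(nodes)
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        current = stack.pop()
        for nb in grid_neighbors(current, shape):
            if nb in nodes and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == len(nodes)


def max_weight_connected_subgraph(weights):
    """Exhaustively find the best-scoring connected node set.

    weights[t][y][x] is a hypothetical per-node classifier score.
    Runtime is exponential in the node count, so this is for toy grids only.
    """
    shape = (len(weights), len(weights[0]), len(weights[0][0]))
    all_nodes = list(product(*(range(n) for n in shape)))
    best_score, best_nodes = float("-inf"), frozenset()
    for size in range(1, len(all_nodes) + 1):
        for subset in combinations(all_nodes, size):
            if not is_connected(subset, shape):
                continue
            score = sum(weights[t][y][x] for t, y, x in subset)
            if score > best_score:
                best_score, best_nodes = score, frozenset(subset)
    return best_score, best_nodes


# Toy 2 x 2 x 2 video volume, indexed as weights[t][y][x] (made-up numbers).
toy_weights = [
    [[2.0, -1.0], [-3.0, 1.5]],
    [[-0.5, -2.0], [-1.0, 3.0]],
]
best_score, best_nodes = max_weight_connected_subgraph(toy_weights)
print(best_score, sorted(best_nodes))
```

On this toy input the best set scores 5.5 and contains four nodes: (0, 0, 0), (0, 0, 1), (0, 1, 1), and (1, 1, 1). It includes one negatively weighted node, (0, 0, 1), because that node links two positive regions. The result is a bent, non-cubical shape that a search restricted to axis-aligned boxes could not select exactly.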