CV · Dec 21, 2016

Temporal Tessellation: A Unified Approach for Video Analysis

arXiv:1612.06950v2 · 29 citations
AI Analysis

The paper addresses the fragmentation of video analysis, where each task usually needs its own method, by offering a single approach that applies across multiple tasks. The contribution is incremental in that it builds on semantic transfer techniques from 2D image analysis.

The paper tackles video understanding by proposing a unified approach that transfers semantics from reference videos to test videos based on clip correspondences, achieving state-of-the-art results in four out of five benchmarks, including LSMDC'16 for captioning and Thumos2014 for action detection.

We present a general approach to video understanding, inspired by semantic transfer techniques that have been successfully used for 2D image analysis. Our method considers a video to be a 1D sequence of clips, each one associated with its own semantics. The nature of these semantics -- natural language captions or other labels -- depends on the task at hand. A test video is processed by forming correspondences between its clips and the clips of reference videos with known semantics, following which, reference semantics can be transferred to the test video. We describe two matching methods, both designed to ensure that (a) reference clips appear similar to test clips and (b), taken together, the semantics of the selected reference clips is consistent and maintains temporal coherence. We use our method for video captioning on the LSMDC'16 benchmark, video summarization on the SumMe and TVSum benchmarks, Temporal Action Detection on the Thumos2014 benchmark, and sound prediction on the Greatest Hits benchmark. Our method not only surpasses the state of the art, in four out of five benchmarks, but importantly, it is the only single method we know of that was successfully applied to such a diverse range of tasks.
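The two matching criteria in the abstract, (a) appearance similarity and (b) consistency of the transferred semantics over time, can be illustrated with a small dynamic program. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the abstract does not spell out either matching method, so the Viterbi-style search, the squared-distance costs, and all names (`match_clips`, `ref_sems`, `coherence_weight`) are hypothetical choices made for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def match_clips(test_feats, ref_feats, ref_sems, coherence_weight=1.0):
    """Pick one reference clip per test clip (hypothetical sketch).

    Minimises the sum of
      (a) appearance cost: distance between each test clip and its chosen
          reference clip, and
      (b) coherence cost: distance between the semantics of reference clips
          chosen for consecutive test clips, scaled by coherence_weight.
    Returns a list of reference-clip indices, one per test clip.
    """
    n_ref = len(ref_feats)
    if not test_feats or n_ref == 0:
        return []

    # cost[k]: best total cost of any assignment ending in reference clip k.
    cost = [sq_dist(test_feats[0], ref) for ref in ref_feats]
    back = []  # back[t][k]: best predecessor of k at test clip t + 1

    for clip in test_feats[1:]:
        new_cost, pointers = [], []
        for k in range(n_ref):
            def total(j, k=k):
                return cost[j] + coherence_weight * sq_dist(ref_sems[j], ref_sems[k])
            best_j = min(range(n_ref), key=total)
            pointers.append(best_j)
            new_cost.append(total(best_j) + sq_dist(clip, ref_feats[k]))
        cost = new_cost
        back.append(pointers)

    # Trace the cheapest assignment backwards from the last test clip.
    path = [min(range(n_ref), key=lambda k: cost[k])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


# Toy data: 1-D appearance features and 1-D "semantics" per reference clip.
test_feats = [[0.1], [0.9], [2.1]]
ref_feats = [[0.0], [1.0], [2.0]]
ref_sems = [[0.0], [10.0], [0.0]]

print(match_clips(test_feats, ref_feats, ref_sems, coherence_weight=0.0))  # [0, 1, 2]
print(match_clips(test_feats, ref_feats, ref_sems, coherence_weight=1.0))  # [0, 0, 2]
```

With `coherence_weight=0.0` the search reduces to a per-clip nearest neighbour. With a positive weight, the middle test clip is no longer matched to reference clip 1, whose semantics disagree with its neighbours, even though that clip looks most similar. Once the indices are chosen, transferring semantics amounts to copying the caption or label of each selected reference clip to the corresponding test clip.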

Code Implementations: 1 repo
