CV · Jul 18, 2018

Video Time: Properties, Encoders and Evaluation

arXiv:1807.06980v1 · 30 citations
Originality: Incremental advance
AI Analysis

This work addresses a gap in video understanding by providing a meta-analysis for evaluating video time encoders. The advance is incremental: it builds on existing methods but introduces new evaluation tasks.

The paper quantifies video time by defining three properties (temporal asymmetry, continuity, and causality) and formulating one task to assess each. It evaluates encoders such as C3D and LSTM on these tasks and proposes a new encoder that the authors report is better suited to them.

Time-aware encoding of frame sequences in a video is a fundamental problem in video understanding. While many have attempted to model time in videos, an explicit study on quantifying video time is missing. To fill this lacuna, we aim to evaluate video time explicitly. We describe three properties of video time, namely a) temporal asymmetry, b) temporal continuity, and c) temporal causality. Based on each, we formulate a task able to quantify the associated property. This allows assessing the effectiveness of modern video encoders, like C3D and LSTM, in their ability to model time. Our analysis provides insights about existing encoders while also leading us to propose a new video time encoder, which is better suited for the video time recognition tasks than C3D and LSTM. We believe the proposed meta-analysis can provide a reasonable baseline to assess video time encoders on equal grounds on a set of temporal-aware tasks.
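
As a rough illustration of what a task for temporal asymmetry could look like, the sketch below builds forward and reversed copies of each clip and labels them by playback direction. The abstract does not spell out the task, so this forward-versus-reversed formulation is an assumption, as are all names here (`make_arrow_of_time_examples`, `is_order_sensitive`, and the two toy encoders). Frames are stood in for by integers; real frames would be image arrays.

```python
import random
from typing import Callable, List, Sequence, Tuple

Frame = int  # placeholder; a real frame would be an image array


def make_arrow_of_time_examples(
    clips: Sequence[Sequence[Frame]], seed: int = 0
) -> List[Tuple[List[Frame], int]]:
    """Return shuffled (clip, label) pairs: 1 = original order, 0 = reversed.

    Hypothetical helper, not taken from the paper.
    """
    rng = random.Random(seed)
    examples: List[Tuple[List[Frame], int]] = []
    for clip in clips:
        frames = list(clip)
        examples.append((frames, 1))        # forward playback
        examples.append((frames[::-1], 0))  # reversed playback
    rng.shuffle(examples)
    return examples


def is_order_sensitive(
    encoder: Callable[[List[Frame]], float], clip: Sequence[Frame]
) -> bool:
    """True if the encoder's output changes when the clip is reversed."""
    frames = list(clip)
    return encoder(frames) != encoder(frames[::-1])


def mean_pool(frames: List[Frame]) -> float:
    """Toy encoder that ignores frame order entirely."""
    return sum(frames) / len(frames)


def position_weighted_pool(frames: List[Frame]) -> float:
    """Toy encoder that weights each frame by its position in the clip."""
    return sum(i * f for i, f in enumerate(frames)) / len(frames)


if __name__ == "__main__":
    clip = [1, 2, 3, 4]
    print(is_order_sensitive(mean_pool, clip))
    print(is_order_sensitive(position_weighted_pool, clip))
```

An encoder that pools frames without regard to order, like `mean_pool`, cannot separate the two labels by construction, which is the kind of limitation such a task is meant to expose.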
