CV · MM · Apr 1, 2021

A Survey on Natural Language Video Localization

arXiv:2104.00234v1 · 9 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey paper that organizes and reviews existing methods for the NLVL task, which helps researchers understand the field's current state and future directions.

The paper provides a comprehensive survey of natural language video localization (NLVL) algorithms, categorizing them into supervised and weakly-supervised methods and analyzing their strengths and weaknesses.

Natural language video localization (NLVL), which aims to locate the target moment in a video that semantically corresponds to a text query, is a novel and challenging task. In this paper, we present a comprehensive survey of NLVL algorithms: we first propose a general pipeline for NLVL, then categorize the algorithms into supervised and weakly-supervised methods, followed by an analysis of the strengths and weaknesses of each kind of method. Subsequently, we present the datasets, evaluation protocols, and a general performance analysis. Finally, we summarize the existing methods to identify possible future perspectives.
