CV · Jun 13, 2023

A Survey on Video Moment Localization

arXiv:2306.07515v1 · 47 citations · h-index: 77
Originality: Synthesis-oriented
AI Analysis

As a survey, this paper is incremental rather than novel: it summarizes existing work for researchers in video understanding and retrieval.

The paper provides a comprehensive review of video moment localization techniques, covering supervised, weakly supervised, and unsupervised methods, along with datasets and future directions.

Video moment localization, also known as video moment retrieval, aims to locate the target segment within a video that is described by a given natural language query. Going beyond temporal action localization, where the target actions are pre-defined, video moment retrieval can query arbitrary complex activities. In this survey, we present a comprehensive review of existing video moment localization techniques, including supervised, weakly supervised, and unsupervised ones. We also review the datasets available for video moment localization and group the results of related work. In addition, we discuss promising future directions for this field, in particular large-scale datasets and interpretable video moment localization models.

