CV | CL | MM | May 1, 2017

Query-adaptive Video Summarization via Quality-aware Relevance Estimation

arXiv:1705.00581v2 | 100 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of generating video summaries that highlight elements relevant to a search query. It is an incremental advance: it builds on existing summarization techniques by adding query adaptation and quality awareness.

The paper frames query-relevant video summarization as a subset selection problem, optimizing jointly for diversity, representativeness, and query relevance. Relevance is measured by a neural network in a shared textual-visual embedding space, and the model also incorporates frame quality. It outperforms the previous state of the art on relevance prediction and, on a new annotated dataset, beats standard baselines such as Maximal Marginal Relevance.

Although the problem of automatic video summarization has recently received a lot of attention, the problem of creating a video summary that also highlights elements relevant to a search query has been less studied. We address this problem by posing query-relevant summarization as a video frame subset selection problem, which lets us optimise for summaries which are simultaneously diverse, representative of the entire video, and relevant to a text query. We quantify relevance by measuring the distance between frames and queries in a common textual-visual semantic embedding space induced by a neural network. In addition, we extend the model to capture query-independent properties, such as frame quality. We compare our method against previous state of the art on textual-visual embeddings for thumbnail selection and show that our model outperforms them on relevance prediction. Furthermore, we introduce a new dataset, annotated with diversity and query-specific relevance labels. On this dataset, we train and test our complete model for video summarization and show that it outperforms standard baselines such as Maximal Marginal Relevance.
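To make the baseline named in the abstract concrete, here is a minimal sketch of greedy Maximal Marginal Relevance (MMR) over frame embeddings. It illustrates the baseline only, not the paper's model. The function names `cosine` and `mmr_select`, the trade-off parameter `lam`, and the toy 3-dimensional vectors are assumptions made for illustration; in practice the frame and query vectors would come from a learned textual-visual embedding.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mmr_select(frames, query, k, lam=0.5):
    """Greedily pick up to k frame indices by Maximal Marginal Relevance.

    Each step picks the frame maximizing
        lam * sim(frame, query) - (1 - lam) * max sim(frame, already selected).
    `lam` (hypothetical name) trades relevance against redundancy.
    """
    selected = []
    candidates = list(range(len(frames)))

    def score(i):
        relevance = cosine(frames[i], query)
        redundancy = max(
            (cosine(frames[i], frames[j]) for j in selected), default=0.0
        )
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy embeddings (assumed values, not data from the paper).
frames = [
    [1.0, 0.10, 0.0],  # frame 0: highly relevant to the query
    [1.0, 0.12, 0.0],  # frame 1: near-duplicate of frame 0
    [0.7, 0.00, 0.7],  # frame 2: less relevant but visually different
]
query = [1.0, 0.0, 0.0]

summary = mmr_select(frames, query, k=2, lam=0.5)
relevance_only = mmr_select(frames, query, k=2, lam=1.0)
```

With `lam=0.5` the near-duplicate frame is penalized, so the second pick is the visually different frame. With `lam=1.0` the selection reduces to ranking by query relevance alone.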

Code Implementations: 1 repo