LGCVDec 31, 2020

Searching a Raw Video Database using Natural Language Queries

arXiv:2012.15565v1
AI Analysis

This work aims to improve video searchability for users of video streaming platforms, which is an incremental improvement to existing search paradigms.

This paper addresses the challenge of searching large video databases using natural language queries. It proposes an end-to-end pipeline that leverages recurrent and convolutional neural networks to generate captions for video clips, enabling text-based search.

The number of videos being produced and consequently stored in databases for video streaming platforms has been increasing exponentially over time. This vast database should be easily index-able to find the requisite clip or video to match the given search specification, preferably in the form of a textual query. This work aims to provide an end-to-end pipeline to search a video database with a voice query from the end user. The pipeline makes use of Recurrent Neural Networks in combination with Convolutional Neural Networks to generate captions of the video clips present in the database.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes