CNN-VWII: An Efficient Approach for Large-Scale Video Retrieval by Image Queries
This work addresses efficient video retrieval for users who need to search large video datasets by image queries; its contribution is an incremental improvement in retrieval speed.
The paper tackles large-scale video retrieval using a query image by combining CNNs and Bag of Visual Words with a visual weighted inverted index, achieving up to an order of magnitude speed improvement over state-of-the-art methods while maintaining similar accuracy.
This paper addresses the problem of large-scale video retrieval by a query image. First, we define the problem of the top-$k$ image-to-video query. Then, we combine the merits of convolutional neural networks (CNN) and the Bag of Visual Words (BoVW) module to design a model for extracting and representing video frame information. To meet the requirements of large-scale video retrieval, we propose a visual weighted inverted index (VWII) and a related algorithm to improve the efficiency and accuracy of the retrieval process. Comprehensive experiments show that the proposed technique achieves substantial improvements (up to an order of magnitude speedup) over state-of-the-art techniques with similar accuracy.
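To make the idea of a top-$k$ query over a weighted inverted index concrete, the following is a minimal illustrative sketch and not the VWII structure or algorithm from the paper. It assumes each video has already been reduced to a map from visual word ids to weights (for example, by quantizing CNN frame features), and it scores videos by a simple dot product with the query image's word weights. The class name `WeightedInvertedIndex`, its methods, and the scoring rule are all hypothetical choices made for illustration.

```python
import heapq
from collections import defaultdict


class WeightedInvertedIndex:
    """Toy inverted index: visual word id -> list of (video_id, weight).

    Hypothetical illustration only; not the paper's VWII.
    """

    def __init__(self):
        self.postings = defaultdict(list)

    def add_video(self, video_id, word_weights):
        # word_weights: {visual_word_id: weight}, assumed to be aggregated
        # over the video's frames by some upstream feature extractor.
        for word, weight in word_weights.items():
            self.postings[word].append((video_id, weight))

    def top_k(self, query_weights, k):
        # Accumulate a dot-product score, visiting only the posting lists
        # of visual words that occur in the query image.
        scores = defaultdict(float)
        for word, q_weight in query_weights.items():
            for video_id, weight in self.postings.get(word, ()):
                scores[video_id] += q_weight * weight
        # Highest score first; ties are broken by video id for determinism.
        return heapq.nsmallest(k, scores.items(), key=lambda kv: (-kv[1], kv[0]))


index = WeightedInvertedIndex()
index.add_video("v1", {3: 0.5, 7: 0.5})
index.add_video("v2", {7: 0.75})
index.add_video("v3", {9: 1.0})
result = index.top_k({7: 1.0, 3: 1.0}, k=2)
print(result)
```

Because only the posting lists of the query's visual words are scanned, videos that share no visual word with the query (here `v3`) are never touched, which is the general reason inverted indexes help at scale.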