IR · CV · Nov 19, 2014

Efficient Media Retrieval from Non-Cooperative Queries

arXiv:1411.5307v11 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses a domain-specific problem in media retrieval, but its contribution is incremental: it builds on existing retrieval features such as VLAD.

The paper tackles the problem of retrieving book covers from poorly conditioned query images. It constructs a large-scale dataset with 100K distractor covers and proposes a method that combines noisy OCR text matching with VLAD features. The combined approach significantly improves retrieval accuracy over using either VLAD or text alone.

Text is ubiquitous in the artificial world and easily attainable when it comes to book titles and author names. Using the images from the book cover set of the Stanford Mobile Visual Search dataset, together with additional book covers and metadata from openlibrary.org, we construct a large-scale book cover retrieval dataset, complete with 100K distractor covers and title and author strings for each. Because our query images are poorly conditioned for clean text extraction, we propose a method for extracting noisy and erroneous OCR readings and matching them against clean author and book title strings in a standard document look-up problem setup. Finally, we demonstrate how to use this text matching as a feature in conjunction with popular retrieval features such as VLAD, using a simple learning setup, to achieve significant improvements in retrieval accuracy over that of either VLAD or the text alone.
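The matching step described in the abstract can be illustrated with a small sketch. The abstract does not specify the matching model, so everything below is an assumption made for illustration: character trigram TF-IDF with cosine similarity stands in for the document look-up, the function names (`char_ngrams`, `build_index`, `text_scores`, `fuse`) are hypothetical, and the fixed fusion weights stand in for weights that the paper learns. Character n-grams are chosen here because they tolerate the single-character substitutions typical of OCR errors.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Lowercase, drop punctuation, and count character n-grams."""
    cleaned = "".join(c for c in text.lower() if c.isalnum() or c == " ")
    padded = " " + " ".join(cleaned.split()) + " "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def build_index(documents, n=3):
    """Compute smoothed IDF weights and TF-IDF vectors for the clean strings."""
    counts = [char_ngrams(d, n) for d in documents]
    doc_freq = Counter(g for c in counts for g in c)
    total = len(documents)
    idf = {g: math.log((1 + total) / (1 + k)) + 1.0 for g, k in doc_freq.items()}
    vectors = [{g: tf * idf[g] for g, tf in c.items()} for c in counts]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(g, 0.0) for g, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def text_scores(ocr_text, idf, vectors, n=3):
    """Score a noisy OCR reading against every indexed clean string.

    N-grams that never occur in the index are dropped from the query.
    """
    query = {g: tf * idf[g] for g, tf in char_ngrams(ocr_text, n).items() if g in idf}
    return [cosine(query, v) for v in vectors]


def fuse(text_score, vlad_score, w_text=0.5, w_vlad=0.5):
    """Linear late fusion; the weights are placeholders, not learned values."""
    return w_text * text_score + w_vlad * vlad_score


# Hypothetical catalog of clean "title author" strings and one noisy OCR reading.
catalog = [
    "The Great Gatsby F. Scott Fitzgerald",
    "Great Expectations Charles Dickens",
    "Moby Dick Herman Melville",
]
idf, vectors = build_index(catalog)
scores = text_scores("THE GRFAT GATSBV F SC0TT FITZGERALD", idf, vectors)
best = max(range(len(catalog)), key=lambda i: scores[i])
print(catalog[best])
```

In this toy catalog the corrupted reading still ranks the Gatsby entry first, because most of its trigrams survive the OCR errors. In a real system, the text score for each candidate cover would be passed to something like `fuse` together with a VLAD similarity, with the weights fitted on training queries rather than fixed by hand.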
