IR MM Apr 30, 2019

Effective and Efficient Indexing in Cross-Modal Hashing-Based Datasets

arXiv:1904.13325v2, 6 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of real-time and accurate cross-modal retrieval for multimedia applications, representing an incremental improvement over existing hashing methods.

The paper tackles slow and inaccurate nearest neighbor search over cross-modal hash codes. It proposes a probability-based index scheme that uses a few bits of each hash code as an index code, builds an inverted index table over those index codes, and trains a neural network to improve indexing accuracy and efficiency. The authors report improved performance on two benchmark datasets for image-text retrieval.

To overcome storage and computation barriers, hashing has recently been widely used for nearest neighbor search in multimedia retrieval applications. In particular, cross-modal retrieval, which searches across different modalities, has become an active but challenging problem. Although dozens of cross-modal hashing algorithms have been proposed to yield compact binary codes, exhaustive search is impractical for real-time use, and Hamming distance computation suffers from inaccurate results. In this paper, we propose a novel search method that uses a probability-based index scheme over binary hash codes in cross-modal retrieval. The proposed hash code indexing scheme exploits a few binary bits of the hash code as the index code. We construct an inverted index table based on index codes and train a neural network to improve the indexing accuracy and efficiency. Experiments are performed on two benchmark datasets for retrieval across image and text modalities, where hash codes are generated by three cross-modal hashing methods. Results show that the proposed method effectively boosts the performance of these hash methods.
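The inverted-index idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say which bits form the index code or how the neural network is used at query time, so the sketch assumes the index code is the first `k` bits of each hash code, and it takes the list of buckets to probe as a plain argument (`probe`) standing in for whatever a learned probability model would rank highest. All function names and the toy data are hypothetical.

```python
from collections import defaultdict

def index_code(hash_code: str, k: int) -> str:
    # Assumption: the index code is simply the first k bits of the hash code.
    return hash_code[:k]

def build_inverted_index(db_codes, k):
    # Map each index code to the list of database item ids that share it.
    table = defaultdict(list)
    for item_id, code in enumerate(db_codes):
        table[index_code(code, k)].append(item_id)
    return table

def hamming(a: str, b: str) -> int:
    # Number of bit positions where two equal-length codes differ.
    return sum(x != y for x, y in zip(a, b))

def search(query_code, db_codes, table, k, probe=None, top_n=3):
    # probe: index codes (buckets) to visit. By default only the query's own
    # bucket is visited; a probability model could supply a ranked list instead.
    if probe is None:
        probe = [index_code(query_code, k)]
    candidates = [i for bucket in probe for i in table.get(bucket, [])]
    # Rank only the candidates by Hamming distance, not the whole database.
    ranked = sorted(candidates, key=lambda i: hamming(query_code, db_codes[i]))
    return ranked[:top_n]

# Toy 8-bit hash codes (made up for illustration).
db_codes = ["00101101", "00111100", "11010010", "00100001", "11011110"]
table = build_inverted_index(db_codes, k=2)
result = search("00101100", db_codes, table, k=2)
```

Here the query only visits the bucket for index code `00`, so three of the five database items are compared instead of all five. Probing additional buckets trades extra comparisons for a lower chance of missing a true neighbor whose index bits differ from the query's.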

Code Implementations: 1 repo
