CV | Aug 15, 2016

Transitive Hashing Network for Heterogeneous Multimedia Retrieval

arXiv:1608.04307v1 | 27 citations
Originality: Incremental advance
AI Analysis

This work addresses efficient retrieval across different media types (e.g., images and text) for applications like search engines, but it is incremental as it builds on existing cross-modal hashing methods.

The paper tackles cross-modal hashing for multimedia retrieval. It relaxes the assumption that heterogeneous relationships are available within the query/database domain and instead learns them from an auxiliary dataset. It reports state-of-the-art performance on the public datasets NUS-WIDE and ImageNet-YahooQA.

Hashing has been widely applied to large-scale multimedia retrieval due to its storage and retrieval efficiency. Cross-modal hashing enables efficient retrieval from a database of one modality in response to a query of another modality. Existing work on cross-modal hashing assumes that a heterogeneous relationship across modalities is available for hash function learning. In this paper, we relax this strong assumption by only requiring such a heterogeneous relationship in an auxiliary dataset different from the query/database domain. We craft a hybrid deep architecture to simultaneously learn the cross-modal correlation from the auxiliary dataset and align the data distributions between the auxiliary dataset and the query/database domain, which generates transitive hash codes for heterogeneous multimedia retrieval. Extensive experiments show that the proposed approach yields state-of-the-art multimedia retrieval performance on public datasets, i.e. NUS-WIDE and ImageNet-YahooQA.
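To make the retrieval step concrete, here is a minimal sketch of how cross-modal hash codes are typically used once hash functions exist: features from each modality are mapped to binary codes of the same length, and a query from one modality is ranked against a database of the other by Hamming distance. This is not the paper's network. The random linear projections stand in for learned hash functions, and the names `to_hash_codes`, `hamming_distances`, `retrieve`, the feature dimensions, and the code length are all assumptions made for illustration.

```python
import numpy as np

def to_hash_codes(features, projection):
    """Map real-valued features to {-1, +1} codes via the sign of a linear projection.

    The projection is a stand-in (assumption) for a learned hash function.
    """
    return np.where(features @ projection >= 0, 1, -1).astype(np.int8)

def hamming_distances(query_code, database_codes):
    """Hamming distance between one code and every row of database_codes.

    For b-bit codes in {-1, +1}: distance = (b - <query, item>) / 2.
    """
    bits = database_codes.shape[1]
    dots = database_codes.astype(np.int32) @ query_code.astype(np.int32)
    return (bits - dots) // 2

def retrieve(query_code, database_codes, top_k=3):
    """Indices of the top_k database items closest to the query in Hamming space."""
    dists = hamming_distances(query_code, database_codes)
    return np.argsort(dists, kind="stable")[:top_k]

# Hypothetical setup: 64-dim image features, 32-dim text features, 16-bit codes.
rng = np.random.default_rng(0)
n_bits = 16
image_proj = rng.standard_normal((64, n_bits))  # placeholder image hash function
text_proj = rng.standard_normal((32, n_bits))   # placeholder text hash function

image_db = rng.standard_normal((100, 64))       # synthetic image database
text_query = rng.standard_normal(32)            # synthetic text query

db_codes = to_hash_codes(image_db, image_proj)
q_code = to_hash_codes(text_query[None, :], text_proj)[0]
top = retrieve(q_code, db_codes, top_k=3)

# Small hand-checkable example with 4-bit codes.
toy_db = np.array([[1, 1, -1, -1], [-1, -1, 1, 1], [1, -1, -1, -1]], dtype=np.int8)
toy_query = np.array([1, 1, -1, -1], dtype=np.int8)
toy_dists = hamming_distances(toy_query, toy_db)
toy_rank = retrieve(toy_query, toy_db, top_k=3)
```

With random projections the ranking in `top` carries no meaning. In the setting the abstract describes, the two hash functions would instead be trained so that related items from different modalities receive nearby codes, with the cross-modal supervision coming from the auxiliary dataset.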

