IR · Apr 25, 2019

Fusion-supervised Deep Cross-modal Hashing

arXiv:1904.11171v2 · 18 citations
Originality: Incremental advance
AI Analysis

This work addresses cross-modal retrieval for applications such as multimedia search, but it appears incremental: it builds on existing deep hashing methods and adds specific enhancements.

The paper tackles the problem of cross-modal retrieval by proposing a fusion-supervised deep hashing method that learns unified binary codes to capture heterogeneous multi-modal correlations and embed semantic information, achieving state-of-the-art performance on two benchmark datasets.

Deep hashing has recently received attention in cross-modal retrieval for its impressive advantages. However, existing hashing methods for cross-modal retrieval cannot fully capture the heterogeneous multi-modal correlation or exploit the semantic information. In this paper, we propose a novel Fusion-supervised Deep Cross-modal Hashing (FDCH) approach. First, FDCH learns unified binary codes through a fusion hash network that takes paired samples as input, which effectively enhances the modeling of the correlation of heterogeneous multi-modal data. Then, these high-quality unified hash codes supervise the training of the modality-specific hash networks that encode out-of-sample queries. Meanwhile, both pair-wise similarity information and classification information are embedded in the hash networks within a one-stream framework, which simultaneously preserves cross-modal similarity and keeps semantic consistency. Experimental results on two benchmark datasets demonstrate the state-of-the-art performance of FDCH.
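The two-stage idea in the abstract can be illustrated with a small NumPy sketch. The abstract does not give the loss functions, so everything below is an assumption made for illustration: a pairwise negative log-likelihood term and a multi-label sigmoid cross-entropy term of the kind commonly used in deep hashing, a squared-error term for the code supervision, and untrained random linear maps (`W_fuse`, `W_cls`, `W_img`) standing in for the fusion and modality-specific networks. It only evaluates the objectives once; it is not the authors' implementation and performs no training.

```python
import numpy as np

def pairwise_similarity_loss(h, S):
    # Assumed form: negative log-likelihood of S under sigmoid(0.5 * h_i . h_j).
    theta = 0.5 * h @ h.T
    return float(np.sum(np.logaddexp(0.0, theta) - S * theta))

def classification_loss(logits, labels):
    # Assumed form: multi-label sigmoid cross-entropy on class logits.
    return float(np.sum(np.logaddexp(0.0, logits) - labels * logits))

def code_supervision_loss(h_modality, b_unified):
    # Assumed form: squared error between a modality-specific code and the unified code.
    return float(np.sum((h_modality - b_unified) ** 2))

def binarize(h):
    # Map real-valued codes to {-1, +1}.
    return np.where(h >= 0, 1.0, -1.0)

# Toy paired data: 6 image/text pairs with multi-label annotations (all hypothetical).
rng = np.random.default_rng(0)
n, d_img, d_txt, n_bits, n_classes = 6, 8, 5, 4, 3
x_img = rng.normal(size=(n, d_img))
x_txt = rng.normal(size=(n, d_txt))
labels = (rng.random((n, n_classes)) > 0.5).astype(float)
S = (labels @ labels.T > 0).astype(float)  # similar if two samples share a label

# Stage 1: a fusion "network" (here a single random linear layer) sees both
# modalities of each pair and yields one unified code per pair.
W_fuse = rng.normal(size=(d_img + d_txt, n_bits)) * 0.1
h_fused = np.tanh(np.concatenate([x_img, x_txt], axis=1) @ W_fuse)
b_unified = binarize(h_fused)

# Similarity and classification terms evaluated on the same codes.
W_cls = rng.normal(size=(n_bits, n_classes)) * 0.1
fusion_objective = (pairwise_similarity_loss(h_fused, S)
                    + classification_loss(h_fused @ W_cls, labels))

# Stage 2: a modality-specific "network" (image only) is scored against the
# unified codes, so it could later encode queries that arrive without a pair.
W_img = rng.normal(size=(d_img, n_bits)) * 0.1
h_img = np.tanh(x_img @ W_img)
image_objective = code_supervision_loss(h_img, b_unified)
```

In a real system each of these objectives would be minimized by gradient descent over deep networks; the sketch only shows how the unified codes from the fusion stage become the targets for the single-modality stage.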

