IR CV Dec 25, 2020

Comprehensive Graph-conditional Similarity Preserving Network for Unsupervised Cross-modal Hashing

Jun Yu, Hao Zhou, Yibing Zhan, Dacheng Tao

arXiv:2012.13538v111.1184 citations Has Code

Originality: Incremental advance

AI Analysis

This work provides an incremental improvement in cross-modal retrieval accuracy for users of unsupervised hashing methods.

This paper addresses the inaccurate similarity problem in unsupervised cross-modal hashing (UCMH) by exploring intrinsic data relationships through a deep graph-neighbor coherence preserving network (DGCPN). DGCPN improves the mean average precision from 0.722 to 0.751 on MIRFlickr-25K with 64-bit hash codes when retrieving texts from images.
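The mean average precision (mAP) figures quoted above are the usual way hashing-based retrieval is scored. The sketch below shows one common way to compute mAP over a Hamming-distance ranking. It is a minimal illustration, not the authors' evaluation script: the function names are hypothetical, and it assumes that a database item counts as relevant when it shares at least one label with the query.

```python
from typing import List, Sequence, Set


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions where two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def average_precision(
    query_code: Sequence[int],
    query_labels: Set[str],
    db_codes: List[Sequence[int]],
    db_labels: List[Set[str]],
) -> float:
    """Average precision of one query over the full Hamming ranking."""
    # Rank database items by Hamming distance to the query.
    # sorted() is stable, so ties keep their original index order.
    order = sorted(
        range(len(db_codes)),
        key=lambda i: hamming_distance(query_code, db_codes[i]),
    )
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        # Assumption: relevance means sharing at least one label.
        if query_labels & db_labels[idx]:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    query_codes: List[Sequence[int]],
    query_labels: List[Set[str]],
    db_codes: List[Sequence[int]],
    db_labels: List[Set[str]],
) -> float:
    """Mean of the per-query average precision values."""
    aps = [
        average_precision(q, labels, db_codes, db_labels)
        for q, labels in zip(query_codes, query_labels)
    ]
    return sum(aps) / len(aps) if aps else 0.0
```

In an image-to-text setting, the query codes would come from the image branch and the database codes from the text branch; the metric itself is the same in both retrieval directions.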

Unsupervised cross-modal hashing (UCMH) has become a hot topic recently. Current UCMH methods focus on exploring data similarities, but they calculate the similarity between two data items mainly from the items' cross-modal features. These methods suffer from inaccurate similarity estimates that result in a suboptimal retrieval Hamming space, because cross-modal features alone are not sufficient to describe complex data relationships, such as cases where two items have different feature representations but share the same inherent concepts. In this paper, we devise a deep graph-neighbor coherence preserving network (DGCPN). Specifically, DGCPN stems from graph models and explores graph-neighbor coherence by consolidating the information between data and their neighbors. DGCPN regulates comprehensive similarity preserving losses by exploiting three types of data similarities (i.e., the graph-neighbor coherence, the coexistent similarity, and the intra- and inter-modality consistency) and designs a half-real and half-binary optimization strategy to reduce the quantization errors during hashing. Essentially, DGCPN addresses the inaccurate similarity problem by exploring and exploiting the data's intrinsic relationships in a graph. We conduct extensive experiments on three public UCMH datasets. The experimental results demonstrate the superiority of DGCPN, e.g., by improving the mean average precision from 0.722 to 0.751 on MIRFlickr-25K using 64-bit hashing codes to retrieve texts from images. We will release the source code package and the trained model on https://github.com/Atmegal/DGCPN.
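The abstract does not give the formula for graph-neighbor coherence, so the following is only an illustrative sketch of the general idea: two items can be treated as similar when they share many nearest neighbors, even if their own features differ. The function names, the use of cosine similarity, and the shared-neighbor ratio are all assumptions made for this example and should not be read as DGCPN's actual definition.

```python
import math
from typing import List, Sequence, Set


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def knn_sets(features: List[Sequence[float]], k: int) -> List[Set[int]]:
    """For each item, the indices of its k most similar other items."""
    n = len(features)
    neighbors = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: cosine(features[i], features[j]), reverse=True)
        neighbors.append(set(others[:k]))
    return neighbors


def neighbor_coherence(features: List[Sequence[float]], k: int) -> List[List[float]]:
    """Hypothetical coherence score: fraction of k nearest neighbors two items share."""
    if k <= 0:
        raise ValueError("k must be positive")
    nbrs = knn_sets(features, k)
    n = len(features)
    return [[len(nbrs[i] & nbrs[j]) / k for j in range(n)] for i in range(n)]
```

A score of this kind depends on an item's neighborhood rather than on its raw features alone, which is the intuition behind consolidating information between data and their neighbors.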

