CV · IR · Apr 23, 2018

Deep Semantic Hashing with Generative Adversarial Networks

Zhaofan Qiu, Yingwei Pan, Ting Yao, Tao Mei

arXiv:1804.08275v1 13.082 citations

Originality: Incremental advance

AI Analysis

This paper addresses scalable and robust image retrieval in domains with limited labeled data. The contribution is incremental, since it builds on existing GAN and hashing techniques.

The paper tackles the limitations of supervised hashing in image retrieval, such as annotation costs and robustness issues due to distribution shifts, by generating synthetic data using semi-supervised GANs to improve hashing quality. It presents DSH-GANs, which achieves superior results on benchmarks like CIFAR-10 and NUS-WIDE compared to state-of-the-art methods.

Hashing is a widely adopted technique for nearest neighbor search in large-scale image retrieval. Recent research has shown that leveraging supervised information can lead to high-quality hashing. However, the cost of annotating data is often an obstacle when applying supervised hashing to a new domain. Moreover, the results can suffer from a robustness problem, because the data at the training and test stages may come from similar but different distributions. This paper explores generating synthetic data through semi-supervised generative adversarial networks (GANs), which leverage largely unlabeled and limited labeled training data to produce highly compelling data with intrinsic invariance and global coherence, for a better understanding of the statistical structures of natural data. We demonstrate that the above two limitations can be well mitigated by applying the synthetic data to hashing. Specifically, a novel deep semantic hashing with GANs (DSH-GANs) is presented, which consists mainly of four components: a deep convolutional neural network (CNN) for learning image representations, an adversary stream to distinguish synthetic images from real ones, a hash stream for encoding image representations into hash codes, and a classification stream. The whole architecture is trained end-to-end by jointly optimizing three losses: an adversarial loss to correctly label each sample as synthetic or real, a triplet ranking loss to preserve the relative similarity ordering in the input real-synthetic triplets, and a classification loss to classify each sample accurately. Extensive experiments conducted on both the CIFAR-10 and NUS-WIDE image benchmarks validate the capability of exploiting synthetic images for hashing. Our framework also achieves superior results when compared to state-of-the-art deep hash models.
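To make the triplet ranking loss and the joint objective more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the squared Euclidean distance, the margin of 1.0, the 0.5 binarization threshold, the equal loss weights, and all function names are assumptions chosen for illustration, and no network or GAN is included. The hash stream output is stood in for by hand-written real-valued codes.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two relaxed (real-valued) codes."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_ranking_loss(
    query: Sequence[float],
    similar: Sequence[float],
    dissimilar: Sequence[float],
    margin: float = 1.0,  # assumed value, not taken from the paper
) -> float:
    """Hinge loss that is zero once the similar code is closer to the
    query than the dissimilar code by at least `margin`."""
    gap = squared_distance(query, similar) - squared_distance(query, dissimilar)
    return max(0.0, margin + gap)


def binarize(code: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Turn a relaxed code into bits by thresholding (assumed scheme)."""
    return [1 if value >= threshold else 0 for value in code]


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of bit positions in which two binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def joint_loss(
    adversarial: float,
    triplet: float,
    classification: float,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),  # hypothetical weights
) -> float:
    """Weighted sum of the three losses named in the abstract."""
    w_adv, w_tri, w_cls = weights
    return w_adv * adversarial + w_tri * triplet + w_cls * classification


# Hand-written 4-bit relaxed codes standing in for hash-stream outputs.
query = [0.9, 0.1, 0.8, 0.2]        # a real image
similar = [0.8, 0.2, 0.9, 0.1]      # e.g. a synthetic image of the same class
dissimilar = [0.1, 0.9, 0.2, 0.7]   # e.g. a synthetic image of another class

print(triplet_ranking_loss(query, similar, dissimilar))
print(hamming_distance(binarize(query), binarize(similar)))
print(hamming_distance(binarize(query), binarize(dissimilar)))
```

In this toy triplet the ordering is already correct, so the ranking loss vanishes; swapping the similar and dissimilar codes yields a positive penalty. At retrieval time only the binarized codes and their Hamming distances would be used.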
