CV, LG | Nov 17, 2019

Transductive Zero-Shot Hashing for Multilabel Image Retrieval

arXiv:1911.07192v2
Originality: Incremental advance
AI Analysis

This work addresses a gap in zero-shot hashing for multi-label image retrieval. The advance is incremental: it extends existing single-label methods to multi-label scenarios.

The paper tackles the problem of retrieving multi-label images with undefined semantic labels (unseen images) by proposing a transductive zero-shot hashing method, achieving significantly better results than competing methods on three popular datasets.

Hash coding is widely used in approximate nearest neighbor search for large-scale image retrieval. Given semantic annotations of the training data, such as class labels and pairwise similarities, hashing methods can learn to generate effective and compact binary codes. Because some newly introduced images may contain undefined semantic labels (we call these unseen images), zero-shot hashing techniques have been studied. However, existing zero-shot hashing methods focus on retrieving single-label images and cannot handle multi-label images. In this paper, for the first time, a novel transductive zero-shot hashing method is proposed for multi-label unseen image retrieval. To predict the labels of the unseen/target data, a visual-semantic bridge is built via instance-concept coherence ranking on the seen/source data. Then, a pairwise similarity loss and a focal quantization loss are constructed for training a hashing model using both the seen/source and unseen/target data. Extensive evaluations on three popular multi-label datasets demonstrate that the proposed hashing method achieves significantly better results than the competing methods.
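To make the retrieval setting in the abstract concrete, here is a minimal sketch of how compact binary codes are typically used at query time: database items are ranked by Hamming distance to the query code. It is not the paper's method. The function names (`pack_bits`, `hamming_distance`, `hamming_rank`, `share_a_label`) are hypothetical, the codes are hand-written rather than produced by a trained hashing model, and the rule that two multi-label images count as similar when they share at least one label is an assumed convention that the abstract does not state.

```python
from typing import List, Sequence, Tuple


def pack_bits(bits: Sequence[int]) -> int:
    """Pack a sequence of 0/1 values into a single integer hash code."""
    code = 0
    for bit in bits:
        code = (code << 1) | (1 if bit else 0)
    return code


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def hamming_rank(query: int, database: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (index, distance) pairs sorted by distance, ties broken by index."""
    scored = [(i, hamming_distance(query, code)) for i, code in enumerate(database)]
    return sorted(scored, key=lambda pair: (pair[1], pair[0]))


def share_a_label(labels_a: Sequence[int], labels_b: Sequence[int]) -> bool:
    """Assumed multi-label similarity: true if any label is active in both vectors."""
    return any(x and y for x, y in zip(labels_a, labels_b))


if __name__ == "__main__":
    # Hand-written 4-bit codes standing in for the output of a hashing model.
    database = [pack_bits(b) for b in ([1, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 0])]
    query = pack_bits([1, 0, 1, 0])
    print(hamming_rank(query, database))  # [(0, 1), (1, 2), (2, 3)]
```

Storing each code as an integer keeps the distance computation to one XOR and a bit count, which is the reason binary codes suit large-scale search.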

Code Implementations: 1 repo