MM, Aug 9, 2021

Two-pronged Strategy: Lightweight Augmented Graph Network Hashing for Scalable Image Retrieval

arXiv:2108.03914v1, 17 citations
Originality: Incremental advance
AI Analysis

This work addresses efficiency and accuracy challenges in unsupervised image retrieval, offering a domain-specific improvement for scalable systems.

The paper tackles the high training cost and limited retrieval accuracy of unsupervised deep hashing for scalable image retrieval. It proposes a lightweight augmented graph network hashing method that reduces the number of parameters and accelerates training while improving accuracy, achieving state-of-the-art results on benchmark datasets.

Hashing learns compact binary codes to store and retrieve massive data efficiently. In particular, unsupervised deep hashing is supported by powerful deep neural networks and has the desirable advantage of label independence, which makes it a promising technique for scalable image retrieval. However, deep models introduce a large number of parameters, which are hard to optimize without explicit semantic labels and which bring considerable training cost. As a result, the retrieval accuracy and training efficiency of existing unsupervised deep hashing methods are still limited. To tackle these problems, we propose a simple and efficient Lightweight Augmented Graph Network Hashing (LAGNH) method with a two-pronged strategy. On the one hand, we extract the inner structure of the image as auxiliary semantics to enhance the semantic supervision of the unsupervised hash learning process. On the other hand, we design a lightweight network structure with the assistance of the auxiliary semantics, which greatly reduces the number of network parameters that need to be optimized and thus greatly accelerates the training process. Specifically, we design a cross-modal attention module based on the auxiliary semantic information to adaptively mitigate the adverse effects in the deep image features. In addition, the hash codes are learned by multi-layer message passing within an adversarially regularized graph convolutional network. Simultaneously, the semantic representation capability of the hash codes is further enhanced by reconstructing the similarity graph.
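To make the general idea concrete, here is a minimal NumPy sketch of graph-based hashing: message passing over a similarity graph, binarization into hash codes, a similarity-reconstruction loss, and Hamming-distance retrieval. It is not the authors' LAGNH implementation. The cross-modal attention module and the adversarial regularization are omitted, no training is performed, and every function name, the kNN graph construction, and the random weights are assumptions made purely for illustration.

```python
import numpy as np


def normalized_adjacency(features, k=3):
    """Build a symmetric kNN cosine-similarity graph with self-loops and
    return D^{-1/2} A D^{-1/2} (hypothetical graph construction)."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = unit @ unit.T
    n = sim.shape[0]
    adj = np.zeros_like(sim)
    for i in range(n):
        order = np.argsort(-sim[i])
        # Keep the k most similar nodes, excluding the node itself.
        neighbours = [j for j in order if j != i][:k]
        adj[i, neighbours] = 1.0
    adj = np.maximum(adj, adj.T) + np.eye(n)  # symmetrize, add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_hash(features, adj_norm, weights):
    """Multi-layer message passing: H <- tanh(A_norm @ H @ W) per layer.
    Returns the relaxed (real-valued) output and its sign as +/-1 codes."""
    h = features
    for w in weights:
        h = np.tanh(adj_norm @ h @ w)
    codes = np.where(h >= 0, 1, -1).astype(np.int8)
    return h, codes


def reconstruction_loss(relaxed, target_sim):
    """Mean squared error between the code inner-product graph and a
    target similarity graph (one simple way to 'reconstruct' the graph)."""
    bits = relaxed.shape[1]
    approx = relaxed @ relaxed.T / bits
    return float(np.mean((approx - target_sim) ** 2))


def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query.
    For +/-1 codes of length b: distance = (b - <q, d>) / 2."""
    bits = db_codes.shape[1]
    dots = db_codes.astype(np.int32) @ query_code.astype(np.int32)
    dist = (bits - dots) // 2
    return np.argsort(dist, kind="stable"), dist


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(8, 16))            # stand-in for image features
    a_norm = normalized_adjacency(feats, k=3)
    layers = [rng.normal(size=(16, 12)), rng.normal(size=(12, 8))]
    relaxed, codes = gcn_hash(feats, a_norm, layers)
    order, dist = hamming_rank(codes[0], codes)
    print("8-bit codes:\n", codes)
    print("ranking for item 0:", order, "distances:", dist[order])
    print("reconstruction loss:", reconstruction_loss(relaxed, a_norm))
```

In a trained system the layer weights would be optimized (for example, by minimizing a loss such as `reconstruction_loss`), whereas the weights here are random and serve only to show how data flows from features to binary codes to a Hamming ranking.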

