CV · May 9, 2019

DistillHash: Unsupervised Deep Hashing by Distilling Data Pairs

arXiv:1905.03465v1 · 166 citations
AI Analysis

DistillHash addresses the challenge of efficient similarity search over large-scale data in unsupervised settings, offering a novel method for a known bottleneck.

The paper tackles the poor performance of unsupervised deep hashing, which stems from the lack of reliable similarity signals. It proposes DistillHash, which distills data pairs with confident similarity signals and learns hash functions from them using a Bayesian learning framework, achieving state-of-the-art search performance on three benchmark datasets.

Due to its high storage and search efficiency, hashing has become prevalent for large-scale similarity search. In particular, deep hashing methods have greatly improved search performance under supervised scenarios. In contrast, unsupervised deep hashing models can hardly achieve satisfactory performance due to the lack of reliable supervisory similarity signals. To address this issue, we propose a novel deep unsupervised hashing model, dubbed DistillHash, which learns a distilled data set consisting of data pairs with confident similarity signals. Specifically, we investigate the relationship between the initial noisy similarity signals learned from local structures and the semantic similarity labels assigned by a Bayes optimal classifier. We show that, under a mild assumption, some data pairs whose labels are consistent with those assigned by the Bayes optimal classifier can potentially be distilled. Inspired by this fact, we design a simple yet effective strategy to distill data pairs automatically and further adopt a Bayesian learning framework to learn hash functions from the distilled data set. Extensive experimental results on three widely used benchmark datasets show that the proposed DistillHash consistently achieves state-of-the-art search performance.
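The idea of keeping only pairs with confident similarity signals can be illustrated with a minimal sketch. This is not the paper's distillation procedure: it assumes cosine similarity between feature vectors as the noisy initial signal, and the function name `distill_pairs` and the thresholds `hi` and `lo` are hypothetical choices made for illustration only.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def distill_pairs(features, hi=0.9, lo=0.1):
    """Keep only pairs whose similarity is confidently high or low.

    Returns a list of (i, j, label) tuples, where label is 1 (similar)
    or 0 (dissimilar). Pairs with similarity strictly between `lo` and
    `hi` are dropped as ambiguous. Both thresholds are hypothetical
    values, not taken from the paper.
    """
    distilled = []
    for i, j in combinations(range(len(features)), 2):
        s = cosine(features[i], features[j])
        if s >= hi:
            distilled.append((i, j, 1))
        elif s <= lo:
            distilled.append((i, j, 0))
    return distilled


# Toy 2-D features standing in for deep features of four images.
features = [
    [1.0, 0.0],
    [0.95, 0.05],
    [0.0, 1.0],
    [0.7, 0.7],
]
pairs = distill_pairs(features)
print(pairs)
```

In this toy run, every pair involving the fourth vector is discarded because its similarity to the others falls in the ambiguous middle range. A hash function would then be trained only on the surviving pairs.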

