LG CL ML | Jun 16, 2020

Generative Semantic Hashing Enhanced via Boltzmann Machines

arXiv:2006.08858v1 | 1000 citations
Originality: Incremental advance
AI Analysis

The paper addresses a limitation of factorized posteriors in generative semantic hashing in order to improve retrieval performance; it is an incremental advance over existing methods.

The paper tackles the independence that existing generative semantic hashing methods enforce among the bits of a hash code. It introduces correlations among the bits through a Boltzmann machine posterior and reports significant performance gains in large-scale information retrieval.

Generative semantic hashing is a promising technique for large-scale information retrieval thanks to its fast retrieval speed and small memory footprint. For the tractability of training, existing generative-hashing methods mostly assume a factorized form for the posterior distribution, enforcing independence among the bits of hash codes. From the perspectives of both model representation and code space size, independence is not always the best assumption. In this paper, to introduce correlations among the bits of hash codes, we propose to employ the distribution of a Boltzmann machine as the variational posterior. To address the intractability of training, we first develop an approximate method to reparameterize the distribution of a Boltzmann machine by augmenting it as a hierarchical concatenation of a Gaussian-like distribution and a Bernoulli distribution. Based on that, an asymptotically-exact lower bound is further derived for the evidence lower bound (ELBO). With these novel techniques, the entire model can be optimized efficiently. Extensive experimental results demonstrate that by effectively modeling correlations among different bits within a hash code, our model can achieve significant performance gains.
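To make the contrast between a factorized posterior and a Boltzmann machine posterior concrete, the following is a minimal sketch, not the paper's method. It assumes the common parameterization b(s) ∝ exp(½ sᵀΣs + μᵀs) over binary codes s ∈ {0,1}^m; the names `boltzmann_pmf`, `bit_covariance`, `mu`, and `Sigma` are hypothetical and chosen for illustration. It normalizes the distribution by brute-force enumeration, which is only feasible for a handful of bits. That cost is the kind of intractability the abstract refers to for realistic code lengths.

```python
import itertools
import numpy as np

def boltzmann_pmf(mu, Sigma):
    """Exact pmf of b(s) ∝ exp(0.5 * s^T Sigma s + mu^T s) over s in {0,1}^m.

    Assumed parameterization for illustration; enumerates all 2^m codes.
    """
    m = len(mu)
    codes = np.array(list(itertools.product([0, 1], repeat=m)), dtype=float)
    # Unnormalized log-probability of every code.
    log_w = 0.5 * np.einsum("ni,ij,nj->n", codes, Sigma, codes) + codes @ mu
    w = np.exp(log_w - log_w.max())  # subtract the max for numerical stability
    return codes, w / w.sum()

def bit_covariance(codes, pmf):
    """Covariance matrix of the bits under the given pmf."""
    mean = pmf @ codes
    second_moment = codes.T @ (codes * pmf[:, None])
    return second_moment - np.outer(mean, mean)

mu = np.zeros(3)

# Boltzmann machine posterior: a positive coupling between bits 0 and 1.
Sigma = np.array([[0.0, 2.0, 0.0],
                  [2.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])
codes, pmf = boltzmann_pmf(mu, Sigma)
cov = bit_covariance(codes, pmf)

# Factorized posterior: no couplings, so the bits are independent Bernoullis.
codes_f, pmf_f = boltzmann_pmf(mu, np.zeros((3, 3)))
cov_f = bit_covariance(codes_f, pmf_f)

print(np.round(cov, 4))    # off-diagonal entry (0, 1) is positive
print(np.round(cov_f, 4))  # all off-diagonal entries vanish
```

With the coupling term set to zero the distribution reduces to independent Bernoulli bits, so the factorized posterior can be read as a special case of this family under the assumed parameterization.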
