DC · IR · SI · Nov 23, 2015

NearBucket-LSH: Efficient Similarity Search in P2P Networks

arXiv:1511.07148v1 · 8 citations
Originality: Incremental advance
AI Analysis

This work addresses network efficiency for similarity search in distributed online social networks, presenting an incremental improvement over existing methods.

The paper tackles the problem of efficient similarity search in large-scale peer-to-peer networks by minimizing network cost while maintaining search quality, achieving over 50% improvement in search quality for a given network cost in many cases.

We present NearBucket-LSH, an effective algorithm for similarity search in large-scale distributed online social networks organized as peer-to-peer overlays. As communication is a dominant consideration in distributed systems, we focus on minimizing the network cost while guaranteeing good search quality. Our algorithm is based on Locality Sensitive Hashing (LSH), which limits the search to collections of objects, called buckets, that have a high probability of being similar to the query. More specifically, NearBucket-LSH employs an LSH extension that also searches near buckets; this improves search quality but significantly increases the network cost. We decrease the network cost by considering the internals of both LSH and the P2P overlay, and harnessing their properties for our needs. We show that NearBucket-LSH increases search quality for a given network cost compared to prior art. In many cases, the search quality increases by more than 50%.
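
To make the bucket and near-bucket idea concrete, here is a minimal single-machine sketch. It is not the paper's implementation. It assumes random-hyperplane (sign) hashing for cosine similarity and treats a "near bucket" as one whose key differs from the query's key in exactly one bit; the abstract specifies neither choice. All function names are hypothetical. The number of buckets probed stands in for network cost, on the assumption that each bucket may live on a different peer.

```python
import random
from collections import defaultdict


def make_hyperplanes(k, dim, seed=0):
    """Draw k random hyperplanes (Gaussian normals) in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]


def bucket_key(vec, hyperplanes):
    """k-bit key: bit i is 1 if vec lies on the non-negative side of plane i."""
    return tuple(
        1 if sum(h_j * v_j for h_j, v_j in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def near_keys(key):
    """All keys at Hamming distance 1 from `key` (assumed 'near buckets')."""
    return [key[:i] + (1 - key[i],) + key[i + 1:] for i in range(len(key))]


def build_index(objects, hyperplanes):
    """Map each bucket key to the names of the objects hashed into it."""
    index = defaultdict(list)
    for name, vec in objects.items():
        index[bucket_key(vec, hyperplanes)].append(name)
    return index


def query(index, vec, hyperplanes, use_near_buckets=True):
    """Return (candidate names, number of buckets probed)."""
    key = bucket_key(vec, hyperplanes)
    keys = [key] + (near_keys(key) if use_near_buckets else [])
    candidates = []
    for k in keys:
        candidates.extend(index.get(k, []))
    return candidates, len(keys)


if __name__ == "__main__":
    planes = make_hyperplanes(k=4, dim=3, seed=1)
    objects = {
        "a": [1.0, 0.0, 0.0],
        "b": [0.9, 0.1, 0.0],
        "c": [-1.0, 0.0, 0.0],
    }
    index = build_index(objects, planes)
    print(query(index, [1.0, 0.05, 0.0], planes, use_near_buckets=False))
    print(query(index, [1.0, 0.05, 0.0], planes, use_near_buckets=True))
```

With a k-bit key, probing near buckets raises the number of buckets visited per query from 1 to k + 1 in this sketch, which illustrates the trade-off the abstract describes: more candidates and better recall, paid for with more lookups. How NearBucket-LSH avoids most of that extra cost in a P2P overlay is described in the paper itself and is not reproduced here.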

