DSSTAT-MECHCRJan 18, 2018

NAE-SAT-based probabilistic membership filters

arXiv:1801.06232v1
AI Analysis

This work addresses efficiency bottlenecks in probabilistic filters for applications like large-scale data verification, though it is incremental as it builds on prior SAT filter methods.

The paper tackles the problem of slow construction and query times in SAT-based probabilistic membership filters by introducing an approach using not-all-equal (NAE) SAT formulas, mapping to random SAT for faster solving, and optimizing with bit packing and fewer hash functions, resulting in improved building and query times.

Probabilistic membership filters are a type of data structure designed to quickly verify whether an element of a large data set belongs to a subset of the data. While false negatives are not possible, false positives are. Therefore, the main goal of any good probabilistic membership filter is to have a small false-positive rate while being memory efficient and fast to query. Although Bloom filters are fast to construct, their memory efficiency is bounded by a strict theoretical upper bound. Weaver et al. introduced random satisfiability-based filters that significantly improved the efficiency of the probabilistic filters, however, at the cost of solving a complex random satisfiability (SAT) formula when constructing the filter. Here we present an improved SAT filter approach with a focus on reducing the filter building times, as well as query times. Our approach is based on using not-all-equal (NAE) SAT formulas to build the filters, solving these via a mapping to random SAT using traditionally-fast random SAT solvers, as well as bit packing and the reduction of the number of hash functions. Paired with fast hardware, NAE-SAT filters could result in enterprise-size applications.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes