Bloom Filters in Adversarial Environments
This work addresses security vulnerabilities in randomized data structures such as Bloom filters when they are deployed in adversarial settings, for example in cybersecurity or database applications. It establishes a foundational connection to cryptography and proves tight bounds, though the contribution is incremental in that it extends existing models.
The paper studies how to design Bloom filters that remain secure when an adversary can choose inputs adaptively, based on previous responses. For computationally bounded adversaries, it shows that non-trivial Bloom filters exist if and only if one-way functions exist. For computationally unbounded adversaries, it shows that there exists a Bloom filter for sets of size n and error ε that is secure against t queries and uses O(n log(1/ε) + t) bits of memory.
Many efficient data structures use randomness, allowing them to improve upon deterministic ones. Usually, their efficiency and correctness are analyzed using probabilistic tools under the assumption that the inputs and queries are independent of the internal randomness of the data structure. In this work, we consider data structures in a more robust model, which we call the adversarial model. Roughly speaking, this model allows an adversary to choose inputs and queries adaptively according to previous responses. Specifically, we consider a data structure known as a "Bloom filter" and prove a tight connection between Bloom filters in this model and cryptography. A Bloom filter represents a set $S$ of elements approximately, by using fewer bits than a precise representation. The price for succinctness is allowing some errors: for any $x \in S$ it should always answer `Yes', and for any $x \notin S$ it should answer `Yes' only with small probability. In the adversarial model, we consider both efficient adversaries (that run in polynomial time) and computationally unbounded adversaries that are only bounded in the number of queries they can make. For computationally bounded adversaries, we show that non-trivial (memory-wise) Bloom filters exist if and only if one-way functions exist. For unbounded adversaries, we show that there exists a Bloom filter for sets of size $n$ and error $\varepsilon$ that is secure against $t$ queries and uses only $O(n \log{\frac{1}{\varepsilon}}+t)$ bits of memory. In comparison, $n\log{\frac{1}{\varepsilon}}$ is the best possible under a non-adaptive adversary.
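To make the definition above concrete, the following is a minimal Python sketch of a textbook Bloom filter: it always answers `Yes' for $x \in S$ and answers `Yes' for $x \notin S$ only with probability roughly $\varepsilon$ when the queries are independent of its randomness. The class name `KeyedBloomFilter`, the use of HMAC-SHA256 with a secret random key to derive bit positions, and the sizing formulas are assumptions chosen for illustration; this is not presented as the construction from the paper, and the sketch makes no claim of security against adaptive adversaries.

```python
import hashlib
import hmac
import math
import os


class KeyedBloomFilter:
    """Textbook Bloom filter; bit positions depend on a secret key (illustrative)."""

    def __init__(self, n, eps, key=None):
        # Standard sizing: k = log2(1/eps) hash functions and
        # m = n * log2(1/eps) / ln(2) bits for n elements and error eps.
        self.k = max(1, math.ceil(math.log2(1 / eps)))
        self.m = max(1, math.ceil(n * math.log2(1 / eps) / math.log(2)))
        # Hypothetical design choice: a random secret key hides the hash
        # functions from whoever issues the queries.
        self.key = key if key is not None else os.urandom(32)
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item: bytes):
        # Derive k positions in [0, m) from HMAC-SHA256(key, index || item).
        for i in range(self.k):
            msg = i.to_bytes(4, "big") + item
            digest = hmac.new(self.key, msg, hashlib.sha256).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item: bytes) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: bytes) -> bool:
        # 'Yes' only if every derived bit is set, so members never get 'No'.
        return all((self.bits[p // 8] >> (p % 8)) & 1
                   for p in self._positions(item))


if __name__ == "__main__":
    members = [f"user-{i}".encode() for i in range(1000)]
    bf = KeyedBloomFilter(n=1000, eps=0.01)
    for x in members:
        bf.add(x)
    # Non-adaptive probes: none of these strings were inserted.
    probes = [f"probe-{i}".encode() for i in range(10000)]
    rate = sum(p in bf for p in probes) / len(probes)
    print("all members found:", all(x in bf for x in members))
    print("empirical false-positive rate:", rate)
```

The probes in the usage example are fixed in advance, so they correspond to the non-adaptive setting. In the adversarial model described above, each query could instead be chosen after seeing earlier answers, which is the situation the stated bounds address.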