A General Technique for Searching in Implicit Sets via Function Inversion
This work provides a general black-box technique for data structure design in computational geometry and fine-grained complexity, with incremental improvements over existing methods for specific applications.
The paper tackles the problem of searching in implicit sets defined as function images by applying the Fiat-Naor function inversion scheme, resulting in data structures with space complexity Õ(N^(1-α/3)) and query time Õ(N^α) for functions computable in constant time. It generalizes these techniques to handle various queries like range counting and geometric computations, unifying prior results on problems such as 3SUM-indexing and string searching.
In recent years, the Fiat-Naor function inversion scheme has been used to disprove conjectures in fine-grained complexity theory and design state of the art data structures for a number of combinatorial problems. We pursue this line of research by considering its application to data structures for searching in implicit sets, defined as the image of a function. We show that, if $f$ is of the form $[N]\to [2^{w}]^d$ for some $w=polylog(N)$ and is computable in constant time, then, for any $0<α<1$, we can obtain a data structure using $Õ(N^{1-α/3})$ space such that, for a given $d$-dimensional axis-aligned box $B$, we can search for some $x\in [N]$ such that $f(x) \in B$ in time $Õ(N^α)$. (Here the $Õ(.)$ notation omits polylogarithmic factors.) Using similar techniques, we further obtain - data structures for range counting and reporting, predecessor, selection, ranking queries, and combinations thereof, on the set $f([N])$, - data structures for preimage size and preimage selection queries for a given value of $f$, and - data structures for selection and ranking queries on geometric quantities computed from tuples of points in $d$-space. These results unify and generalize previously known results on 3SUM-indexing and string searching, and are widely applicable as a black box to a variety of problems. In particular, we give a data structure for a generalized version of gapped string indexing, and show how to preprocess a set of points on an integer grid in order to efficiently compute (in sublinear time), for points contained in a given axis-aligned box, their Theil-Sen estimator, the $k$th largest area triangle, or the induced hyperplane that is the $k$th furthest from the origin.