Metric Dimension and Resolvability of Jaccard Spaces
This work addresses a theoretical problem in metric geometry and combinatorics, with potential applications in classification of symbolic objects, but it is incremental as it builds on existing concepts of resolvability.
The paper tackles the problem of finding minimal resolving sets for Jaccard spaces, which are metric spaces of subsets with Jaccard distance, and shows that the metric dimension is Θ(|X|/ln|X|) and that smaller subsets can resolve pairs of subsets up to a certain size with high probability.
A subset of points in a metric space is said to resolve it if each point in the space is uniquely characterized by its distance to each point in the subset. In particular, resolving sets can be used to represent points in abstract metric spaces as Euclidean vectors. Importantly, due to the triangle inequality, points close by in the space are represented as vectors with similar coordinates, which may find applications in classification problems of symbolic objects under suitably chosen metrics. In this manuscript, we address the resolvability of Jaccard spaces, i.e., metric spaces of the form $(2^X,\text{Jac})$, where $2^X$ is the power set of a finite set $X$, and $\text{Jac}$ is the Jaccard distance between subsets of $X$. Specifically, for different $a,b\in 2^X$, $\text{Jac}(a,b)=|aΔb|/|a\cup b|$, where $|\cdot|$ denotes size (i.e., cardinality) and $Δ$ denotes the symmetric difference of sets. We combine probabilistic and linear algebra arguments to construct highly likely but nearly optimal (i.e., of minimal size) resolving sets of $(2^X,\text{Jac})$. In particular, we show that the metric dimension of $(2^X,\text{Jac})$, i.e., the minimum size of a resolving set of this space, is $Θ(|X|/\ln|X|)$. In addition, we show that a much smaller subset of $2^X$ suffices to resolve, with high probability, all different pairs of subsets of $X$ of cardinality at most $\sqrt{|X|}/\ln|X|$, up to a factor.