11.8DMApr 22
Constant Rate Isometric Embeddings of Hamming Metric into Edit MetricSudatta Bhattacharya, Sanjana Dey, Elazar Goldenberg et al.
A function $φ:\{0,1\}^n \to \{0,1\}^N$ is called an isometric embedding of the $n$-dimensional Hamming metric space to the $N$-dimensional edit metric space if, for all $x,y\in\{0,1\}^n$, the Hamming distance between $x$ and $y$ is equal to the edit distance between $φ(x)$ and $φ(y)$. The rate of such an embedding is defined as the ratio $n/N$. It is well known in the literature how to construct isometric embeddings with rate $Ω(1/\log n)$. However, achieving even near-isometric embeddings with positive constant rate has remained elusive until now. In this paper, we present an isometric embedding with rate $1/8$ by discovering connections to synchronization strings, which were studied in the context of insertion-deletion codes (Haeupler-Shahrasbi [JACM'21]). At a technical level, we introduce a framework for obtaining high-rate isometric embeddings using a novel object called misaligners. As an immediate consequence of our constant-rate isometric embedding, we improve known conditional lower bounds for various optimization problems in the edit metric, now with optimal dependence on the dimension. We complement our results by showing that no isometric embedding $φ:\{0,1\}^n \to \{0,1\}^N$ can have rate greater than $15/32$ for all positive integers $n$. En route to proving this upper bound, we uncover fundamental structural properties necessary for every Hamming-to-edit isometric embedding. We also prove similar upper and lower bounds for embeddings over larger alphabets. Finally, we consider embeddings $φ:Σ_{\mathrm{in}}^n \to Σ_{\mathrm{out}}^N$ between different input and output alphabets, where the rate is given by $\frac{n\log|Σ_{\mathrm{in}}|}{N\log|Σ_{\mathrm{out}}|}$. In this setting, we show that the rate can be made arbitrarily close to $1$.
CCMay 26, 2023
Can You Solve Closest String Faster than Exhaustive Search?Amir Abboud, Nick Fischer, Elazar Goldenberg et al.
We study the fundamental problem of finding the best string to represent a given set, in the form of the Closest String problem: Given a set $X \subseteq Σ^d$ of $n$ strings, find the string $x^*$ minimizing the radius of the smallest Hamming ball around $x^*$ that encloses all the strings in $X$. In this paper, we investigate whether the Closest String problem admits algorithms that are faster than the trivial exhaustive search algorithm. We obtain the following results for the two natural versions of the problem: $\bullet$ In the continuous Closest String problem, the goal is to find the solution string $x^*$ anywhere in $Σ^d$. For binary strings, the exhaustive search algorithm runs in time $O(2^d poly(nd))$ and we prove that it cannot be improved to time $O(2^{(1-ε) d} poly(nd))$, for any $ε> 0$, unless the Strong Exponential Time Hypothesis fails. $\bullet$ In the discrete Closest String problem, $x^*$ is required to be in the input set $X$. While this problem is clearly in polynomial time, its fine-grained complexity has been pinpointed to be quadratic time $n^{2 \pm o(1)}$ whenever the dimension is $ω(\log n) < d < n^{o(1)}$. We complement this known hardness result with new algorithms, proving essentially that whenever $d$ falls out of this hard range, the discrete Closest String problem can be solved faster than exhaustive search. In the small-$d$ regime, our algorithm is based on a novel application of the inclusion-exclusion principle. Interestingly, all of our results apply (and some are even stronger) to the natural dual of the Closest String problem, called the Remotest String problem, where the task is to find a string maximizing the Hamming distance to all the strings in $X$.