Anne Driemel

CG
3papers
96citations
Novelty57%
AI Score42

3 Papers

DSMar 25
Near Linear Time Approximation Schemes for Clustering of Partially Doubling Metrics

Anne Driemel, Jan Höckendorff, Ioannis Psarros et al.

Given a finite metric space $(X\cup Y, \mathbf{d})$ the $k$-median problem is to find a set of $k$ centers $C\subseteq Y$ that minimizes $\sum_{p\in X} \min_{c\in C} \mathbf{d}(p,c)$. In general metrics, the best polynomial time algorithm computes a $(2+ε)$-approximation for arbitrary $ε>0$ (Cohen-Addad et al. STOC 2025). However, if the metric is doubling, a near linear time $(1+ε)$-approximation algorithm is known (Cohen-Addad et al. J. ACM 2021). We show that the $(1+ε)$-approximation algorithm can be generalized to the case when either $X$ or $Y$ has bounded doubling dimension (but the other set not). The case when $X$ is doubling is motivated by the assumption that even though $X$ is part of a high-dimensional space, it may be that it is close to a low-dimensional structure. The case when $Y$ is doubling is motivated by specific clustering problems where the centers are low-dimensional. Specifically, our work in this setting implies the first near linear time approximation algorithm for the $(k,\ell)$-median problem under discrete Fréchet distance when $\ell$ is constant. We further introduce a novel complexity reduction for time series of real values that leads to a similar result for the case of discrete Fréchet distance. In order to solve the case when $Y$ has a bounded doubling dimension, we introduce a dimension reduction that replaces points from $X$ by sets of points in $Y$. To solve the case when $X$ has a bounded doubling dimension, we generalize Talwar's decomposition (Talwar STOC 2004) to our setting. The running time of our algorithms is $2^{2^t} \tilde O(n+m)$ where $t=O(\mathrm{ddim} \log \frac{\mathrm{ddim}}ε)$ and where $\mathrm{ddim}$ is the doubling dimension of $X$ (resp.\ $Y$). The results also extend to the metric facility location problem.

CGMay 3, 2018
Approximating $(k,\ell)$-center clustering for curves

Kevin Buchin, Anne Driemel, Joachim Gudmundsson et al.

The Euclidean $k$-center problem is a classical problem that has been extensively studied in computer science. Given a set $\mathcal{G}$ of $n$ points in Euclidean space, the problem is to determine a set $\mathcal{C}$ of $k$ centers (not necessarily part of $\mathcal{G}$) such that the maximum distance between a point in $\mathcal{G}$ and its nearest neighbor in $\mathcal{C}$ is minimized. In this paper we study the corresponding $(k,\ell)$-center problem for polygonal curves under the Fréchet distance, that is, given a set $\mathcal{G}$ of $n$ polygonal curves in $\mathbb{R}^d$, each of complexity $m$, determine a set $\mathcal{C}$ of $k$ polygonal curves in $\mathbb{R}^d$, each of complexity $\ell$, such that the maximum Fréchet distance of a curve in $\mathcal{G}$ to its closest curve in $\mathcal{C}$ is minimized. In this paper, we substantially extend and improve the known approximation bounds for curves in dimension $2$ and higher. We show that, if $\ell$ is part of the input, then there is no polynomial-time approximation scheme unless $\mathsf{P}=\mathsf{NP}$. Our constructions yield different bounds for one and two-dimensional curves and the discrete and continuous Fréchet distance. In the case of the discrete Fréchet distance on two-dimensional curves, we show hardness of approximation within a factor close to $2.598$. This result also holds when $k=1$, and the $\mathsf{NP}$-hardness extends to the case that $\ell=\infty$, i.e., for the problem of computing the minimum-enclosing ball under the Fréchet distance. Finally, we observe that a careful adaptation of Gonzalez' algorithm in combination with a curve simplification yields a $3$-approximation in any dimension, provided that an optimal simplification can be computed exactly. We conclude that our approximation bounds are close to being tight.

CGMar 11, 2017
Locality-sensitive hashing of curves

Anne Driemel, Francesco Silvestri

We study data structures for storing a set of polygonal curves in ${\rm R}^d$ such that, given a query curve, we can efficiently retrieve similar curves from the set, where similarity is measured using the discrete Fréchet distance or the dynamic time warping distance. To this end we devise the first locality-sensitive hashing schemes for these distance measures. A major challenge is posed by the fact that these distance measures internally optimize the alignment between the curves. We give solutions for different types of alignments including constrained and unconstrained versions. For unconstrained alignments, we improve over a result by Indyk from 2002 for short curves. Let $n$ be the number of input curves and let $m$ be the maximum complexity of a curve in the input. In the particular case where $m \leq \fracα{4d} \log n$, for some fixed $α>0$, our solutions imply an approximate near-neighbor data structure for the discrete Fréchet distance that uses space in $O(n^{1+α}\log n)$ and achieves query time in $O(n^α\log^2 n)$ and constant approximation factor. Furthermore, our solutions provide a trade-off between approximation quality and computational performance: for any parameter $k \in [m]$, we can give a data structure that uses space in $O(2^{2k}m^{k-1} n \log n + nm)$, answers queries in $O( 2^{2k} m^{k}\log n)$ time and achieves approximation factor in $O(m/k)$.