On Parallel $k$-Center Clustering
For large-scale parallel clustering with a large number of clusters (k ≥ Ω(n^δ)), this paper gives a better approximation guarantee than prior work, addressing a scalability bottleneck in MPC algorithms.
The paper tackles the k-center clustering problem in constant-dimensional Euclidean space under the low-local-space MPC model, achieving an O(log* n)-approximation with k(1+o(1)) clusters in O(log log n) rounds, improving on the previous O(log log log n)-approximation.
We consider the classic $k$-center problem in constant-dimensional Euclidean space in a parallel setting, on the low-local-space Massively Parallel Computation (MPC) model, with local space per machine of $O(n^δ)$, where $δ \in (0,1)$ is an arbitrary constant. As a central clustering problem, the $k$-center problem has been studied extensively. Still, until very recently, all parallel MPC algorithms required $Ω(k)$ or even $Ω(k n^δ)$ local space per machine. While this setting covers the case of small values of $k$, for a large number of clusters these algorithms require large local memory, which makes them scale poorly. The case of large $k$, $k \ge Ω(n^δ)$, was considered recently for the low-local-space MPC model by Bateni et al.\ (2021), who gave an $O(\log \log n)$-round MPC algorithm that produces $k(1+o(1))$ centers whose cost is an $O(\log\log\log n)$-approximation of the optimal $k$-center cost. In this paper we extend the algorithm of Bateni et al.\ and design a low-local-space MPC algorithm that in $O(\log\log n)$ rounds returns a clustering with $k(1+o(1))$ clusters that is an $O(\log^* n)$-approximation for $k$-center.
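To make the objective concrete, the following is a minimal sequential sketch of the $k$-center cost, together with the classic farthest-first greedy heuristic, a well-known 2-approximation. It is not the MPC algorithm of this paper or of Bateni et al.; it keeps all points on a single machine, and the function names `kcenter_cost` and `greedy_kcenter` are hypothetical names chosen for illustration. Centers are assumed to be chosen from the input points.

```python
import math

def kcenter_cost(points, centers):
    """k-center cost: largest distance from any point to its nearest center."""
    return max(min(math.dist(p, c) for c in centers) for p in points)

def greedy_kcenter(points, k):
    """Farthest-first traversal: a sequential 2-approximation baseline.

    Illustrative only; this is NOT the low-local-space MPC algorithm.
    """
    centers = [points[0]]  # arbitrary first center
    # dist[i] = distance from points[i] to its nearest chosen center so far
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < min(k, len(points)):
        # pick the point currently farthest from all chosen centers
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers

if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]
    chosen = greedy_kcenter(pts, 2)
    print(chosen, kcenter_cost(pts, chosen))
```

The sketch stores the full point set and a distance array in one place, so it needs memory linear in $n$. That is the kind of requirement the low-local-space setting, with only $O(n^δ)$ space per machine, rules out.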