DS, LG | Dec 17, 2025

Label-consistent clustering for evolving data

Ameet Gadekar, Aristides Gionis, Thibault Marette

arXiv:2512.15210v1 | 2 citations | h-index: 11

Originality: Incremental advance

AI Analysis

This work addresses the need for stable, consistent clustering updates in iterative data analysis. The contribution is incremental, as it builds on existing k-center methods.

The paper tackles the problem of updating clustering solutions for evolving data while minimizing changes from prior solutions, proposing two constant-factor approximation algorithms for the label-consistent k-center problem and demonstrating their effectiveness on real-world datasets.

Data analysis often involves an iterative process, where solutions must be continuously refined in response to new data. Typically, as new data becomes available, an existing solution must be updated to incorporate the latest information. In addition to seeking a high-quality solution for the task at hand, it is also crucial to ensure consistency by minimizing drastic changes from previous solutions. Applying this approach across many iterations ensures that the solution evolves gradually and smoothly. In this paper, we study the above problem in the context of clustering, specifically focusing on the $k$-center problem. More precisely, we study the following problem: Given a set of points $X$, parameters $k$ and $b$, and a prior clustering solution $H$ for $X$, our goal is to compute a new solution $C$ for $X$, consisting of $k$ centers, which minimizes the clustering cost while introducing at most $b$ changes from $H$. We refer to this problem as label-consistent $k$-center, and we propose two constant-factor approximation algorithms for it. We complement our theoretical findings with an experimental evaluation demonstrating the effectiveness of our methods on real-world datasets.
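
To make the problem statement concrete, the following is a minimal brute-force sketch for tiny inputs. It is not one of the paper's approximation algorithms. It rests on several assumptions that the abstract does not confirm: centers are chosen among the input points, each point is labeled by its nearest center, the prior solution $H$ is given as a list of center indices, and a "change" is a point whose label differs between $H$ and $C$. All function names are hypothetical.

```python
from itertools import combinations
from math import dist


def assign(points, centers):
    """Label each point with the index of its nearest center.

    Ties are broken toward the smaller center index (an assumption).
    """
    return [min(centers, key=lambda c: (dist(p, points[c]), c)) for p in points]


def kcenter_cost(points, centers):
    """k-center cost: the largest distance from any point to its nearest center."""
    return max(min(dist(p, points[c]) for c in centers) for p in points)


def label_changes(old_labels, new_labels):
    """Number of points whose label differs between two labelings."""
    return sum(o != n for o, n in zip(old_labels, new_labels))


def brute_force_label_consistent(points, k, b, prior_centers):
    """Try every k-subset of points as centers (exponential, tiny inputs only).

    Returns (cost, centers) for the cheapest solution with at most b label
    changes relative to the prior solution, or None if no subset qualifies.
    """
    old = assign(points, prior_centers)
    best = None
    for centers in combinations(range(len(points)), k):
        if label_changes(old, assign(points, centers)) > b:
            continue  # violates the consistency budget
        cost = kcenter_cost(points, centers)
        if best is None or cost < best[0]:
            best = (cost, centers)
    return best


# Hypothetical toy instance: five points on a line, prior centers at indices 0 and 2.
points = [(0, 0), (1, 0), (10, 0), (11, 0), (5, 0)]
strict = brute_force_label_consistent(points, k=2, b=0, prior_centers=[0, 2])
loose = brute_force_label_consistent(points, k=2, b=5, prior_centers=[0, 2])
```

With a budget of `b=0` the search can only keep the prior centers, while a larger budget lets it trade label stability for a lower clustering cost. This is the tension the label-consistent $k$-center problem formalizes.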
