Scalable Varied-Density Clustering via Graph Propagation
This work addresses the challenge of clustering large-scale, high-dimensional datasets efficiently, a problem relevant to data scientists and machine learning practitioners, though it appears incremental because it builds on existing graph propagation techniques.
The paper tackles the problem of varied-density clustering for high-dimensional data by framing it as a label propagation process in adaptive neighborhood graphs, resulting in a method that scales to millions of points in minutes with competitive accuracy.
We propose a novel perspective on varied-density clustering for high-dimensional data by framing it as a label propagation process in neighborhood graphs that adapt to local density variations. Our method formally connects density-based clustering with graph connectivity, enabling the use of efficient graph propagation techniques developed in network science. To ensure scalability, we introduce a density-aware neighborhood propagation algorithm and use random projection methods to construct approximate neighborhood graphs. Our approach significantly reduces computational cost while preserving clustering quality. Empirically, it scales to datasets with millions of points in minutes and achieves accuracy competitive with existing baselines.
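The link between density-based clustering and graph connectivity can be illustrated with a small sketch. This is not the paper's algorithm. It makes several assumptions: a brute-force k-nearest-neighbor search stands in for the approximate random-projection construction, a mutual k-NN rule stands in for the density-adaptive neighborhood, and plain min-label propagation stands in for the density-aware propagation step. All function names (`knn_graph`, `mutual_edges`, `propagate_labels`, `make_blobs`) and the parameter `k` are hypothetical.

```python
import math
import random


def knn_graph(points, k):
    """Brute-force k-nearest-neighbor lists, O(n^2).
    Assumption: a stand-in for an approximate, random-projection-based search."""
    neighbors = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        neighbors.append([j for _, j in dists[:k]])
    return neighbors


def mutual_edges(neighbors):
    """Keep edge (i, j) only if i and j list each other as neighbors.
    A point's k-NN radius is small in dense regions and large in sparse ones,
    so the resulting graph adapts to local density (an assumed construction)."""
    sets = [set(nb) for nb in neighbors]
    adj = [[] for _ in neighbors]
    for i, nb in enumerate(neighbors):
        for j in nb:
            if i in sets[j]:
                adj[i].append(j)
    return adj


def propagate_labels(adj):
    """Min-label propagation: each node repeatedly adopts the smallest label
    among itself and its neighbors until nothing changes. At convergence the
    labels identify the connected components of the graph."""
    labels = list(range(len(adj)))
    changed = True
    while changed:
        changed = False
        for i, nb in enumerate(adj):
            best = min([labels[i]] + [labels[j] for j in nb])
            if best != labels[i]:
                labels[i] = best
                changed = True
    return labels


def make_blobs(seed=0):
    """Toy data: one tight blob and one loose blob, far apart."""
    rng = random.Random(seed)
    dense = [(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)) for _ in range(60)]
    sparse = [(rng.gauss(20.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(60)]
    return dense + sparse


points = make_blobs()
labels = propagate_labels(mutual_edges(knn_graph(points, k=10)))
```

In this toy setting no single distance threshold suits both blobs, whereas the k-NN neighborhood scales with each point's surroundings. Points that share no mutual neighbor keep their own label, so a practical method would still need a rule for such isolated points, and the quadratic neighbor search would have to be replaced to reach the scale described in the abstract.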