EVINGCA: Adaptive Graph Clustering with Evolving Neighborhood Statistics
This work addresses clustering limitations for data scientists dealing with complex datasets, though it appears incremental as it builds on existing graph-based methods.
The authors tackled the problem of clustering algorithms relying on restrictive assumptions by developing EVINGCA, an adaptive graph clustering algorithm that uses evolving neighborhood statistics to guide cluster formation, achieving competitive performance across synthetic and real-world datasets.
Clustering algorithms often rely on restrictive assumptions: K-Means and Gaussian Mixtures presuppose convex, Gaussian-like clusters, while DBSCAN and HDBSCAN capture non-convexity but can be highly sensitive. I introduce EVINGCA (Evolving Variance-Informed Nonparametric Graph Construction Algorithm), a density-variance based clustering algorithm that treats cluster formation as an adaptive, evolving process on a nearest-neighbor graph. EVINGCA expands rooted graphs via breadth-first search, guided by continuously updated local distance and shape statistics, replacing fixed density thresholds with local statistical feedback. With spatial indexing, EVINGCA features log-linear complexity in the average case and exhibits competitive performance against baselines across a variety of synthetic, real-world, low-d, and high-d datasets.