CV
Feb 28, 2019

Efficient Parameter-free Clustering Using First Neighbor Relations

arXiv:1902.11266v1
230 citations
Originality: Incremental advance
AI Analysis

The paper provides a parameter-free clustering method for large-scale practical problems across various domains.

The authors address clustering without hyperparameters or a preset number of clusters by proposing a method based on first-neighbor relations. They report substantial performance gains over existing clustering techniques on datasets ranging from 1,077 to 8.1 million samples.

We present a new clustering method in the form of a single clustering equation that is able to directly discover groupings in the data. The main proposition is that the first neighbor of each sample is all one needs to discover large chains and find the groups in the data. In contrast to most existing clustering algorithms, our method does not require any hyper-parameters or distance thresholds, nor does it need the number of clusters to be specified. The proposed algorithm belongs to the family of hierarchical agglomerative methods. The technique has a very low computational overhead, is easily scalable, and is applicable to large practical problems. Evaluation on well-known datasets from different domains, ranging from 1,077 to 8.1 million samples, shows substantial performance gains compared to existing clustering techniques.
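To make the first-neighbor idea concrete, here is a minimal sketch of a single partitioning step. It assumes that two samples are linked whenever one is the other's first (nearest) neighbor or both share the same first neighbor, and that clusters are the connected components of the resulting graph. The function name `first_neighbor_partition`, the brute-force Euclidean distance, and the union-find bookkeeping are illustrative choices, not the authors' implementation. The hierarchical merging of later partitions is not shown.

```python
import numpy as np


def first_neighbor_partition(X):
    """Label samples by connected components of the first-neighbor graph.

    Illustrative sketch only: uses brute-force O(n^2) Euclidean distances,
    which is an assumption made for brevity, not a scalable choice.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a sample is never its own neighbor

    # Index of the first (nearest) neighbor of every sample.
    first = dist.argmin(axis=1)

    # Union-find over the edges (i, first[i]). Samples that share a first
    # neighbor end up in the same component through that neighbor.
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i, j in enumerate(first):
        ri, rj = find(i), find(int(j))
        if ri != rj:
            parent[rj] = ri

    # Relabel component roots as 0..k-1 in order of first appearance.
    relabel = {}
    return np.array([relabel.setdefault(find(i), len(relabel)) for i in range(n)])


if __name__ == "__main__":
    points = [[0, 0], [0, 1], [10, 10], [10, 11], [10, 12.5]]
    print(first_neighbor_partition(points))
```

No number of clusters or distance threshold is passed in: the grouping follows entirely from each sample's nearest neighbor, which is the property the abstract emphasizes.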

Code Implementations: 1 repo
