CV · Feb 28, 2019

Efficient Parameter-free Clustering Using First Neighbor Relations

M. Saquib Sarfraz, Vivek Sharma, Rainer Stiefelhagen

arXiv:1902.11266v1 · 24.6230 citations · Has Code

Originality: Incremental advance

AI Analysis

The method provides a parameter-free clustering solution for large-scale practical problems across various domains.

The authors tackled the problem of clustering without requiring hyperparameters or specifying the number of clusters, by proposing a method based on first neighbor relations, achieving substantial performance gains on datasets ranging from 1,077 to 8.1 million samples.

We present a new clustering method in the form of a single clustering equation that is able to directly discover groupings in the data. The main proposition is that the first neighbor of each sample is all one needs to discover large chains and find the groups in the data. In contrast to most existing clustering algorithms, our method does not require any hyper-parameters or distance thresholds, nor does it need the number of clusters to be specified. The proposed algorithm belongs to the family of hierarchical agglomerative methods. The technique has a very low computational overhead, is easily scalable, and is applicable to large practical problems. Evaluation on well-known datasets from different domains, ranging from 1,077 to 8.1 million samples, shows substantial performance gains compared to existing clustering techniques.
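The first-neighbor idea in the abstract can be illustrated with a small sketch. This is not the authors' released code: the function name `first_neighbor_partition`, the choice of Euclidean distance, and the brute-force neighbor search are all assumptions made for illustration. Each sample is linked to its first (nearest) neighbor, and the connected components of the resulting graph are taken as clusters, so samples that share a first neighbor, or are each other's first neighbor, fall into the same group without any threshold or cluster count.

```python
import numpy as np


def first_neighbor_partition(X):
    """Hypothetical helper: one partition step from first-neighbor links.

    Links every sample to its nearest neighbor and returns one integer
    label per sample, identifying its connected component.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Assumption: brute-force squared Euclidean distances, O(n^2) memory.
    # A large-scale version would use an approximate neighbor search.
    sq = np.sum(X * X, axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d, np.inf)  # a sample is not its own neighbor
    nn = np.argmin(d, axis=1)    # index of each sample's first neighbor

    # Union-find over the edges (i, nn[i]) to get connected components.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in enumerate(nn):
        ri, rj = find(i), find(int(j))
        if ri != rj:
            parent[rj] = ri

    roots = [find(i) for i in range(n)]
    # Relabel component roots as consecutive integers 0..k-1.
    _, labels = np.unique(roots, return_inverse=True)
    return labels


if __name__ == "__main__":
    points = np.array([[0, 0], [0, 1], [1, 0],
                       [10, 10], [10, 11], [11, 10]])
    print(first_neighbor_partition(points))
```

Repeating this step on cluster representatives (for example their means, which is an assumption here) would give successively coarser partitions, in keeping with the hierarchical agglomerative family the abstract names.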
