To Cluster, or Not to Cluster: An Analysis of Clusterability Methods
This work addresses the challenge that clustering users face when selecting a clusterability measure, but it is incremental: it compares existing methods without introducing new ones.
The paper tackles the problem of selecting appropriate clusterability measures by performing an extensive comparison of existing methods and providing guidelines for users to choose suitable ones for their applications.
Clustering is an essential data mining tool that aims to discover inherent cluster structure in data. For most applications, applying clustering is only appropriate when cluster structure is present. As such, the study of clusterability, which evaluates whether data possesses such structure, is an integral part of cluster analysis. However, methods for evaluating clusterability vary radically, making it challenging to select a suitable measure. In this paper, we perform an extensive comparison of measures of clusterability and provide guidelines that clustering users can reference to select suitable measures for their applications.
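To make the notion of a clusterability measure concrete, here is a minimal sketch of one widely known example, the Hopkins statistic. It is an illustration only: the abstract does not say which measures are compared, so this choice is an assumption, as are the function name `hopkins_statistic`, the parameters `m` and `seed`, and the convention that values near 1 suggest cluster structure while values near 0.5 suggest uniformly scattered data.

```python
import numpy as np


def hopkins_statistic(X, m=None, seed=0):
    """Illustrative Hopkins statistic (hypothetical helper, not from the paper).

    Compares nearest-neighbour distances of uniformly generated points
    against those of sampled real points. Convention assumed here:
    values near 1 suggest clustered data, values near 0.5 suggest
    uniformly scattered data.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if m is None:
        m = max(1, n // 10)  # assumed default: sample about 10% of the points

    # Sample m real data points without replacement.
    idx = rng.choice(n, size=m, replace=False)

    # Generate m points uniformly inside the bounding box of the data.
    lo, hi = X.min(axis=0), X.max(axis=0)
    U = rng.uniform(lo, hi, size=(m, d))

    # Distance from each uniform point to its nearest data point.
    u_dist = np.linalg.norm(U[:, None, :] - X[None, :, :], axis=2).min(axis=1)

    # Distance from each sampled data point to its nearest *other* data point.
    D = np.linalg.norm(X[idx][:, None, :] - X[None, :, :], axis=2)
    D[np.arange(m), idx] = np.inf  # exclude each point's distance to itself
    w_dist = D.min(axis=1)

    u = np.sum(u_dist ** d)
    w = np.sum(w_dist ** d)
    return u / (u + w)
```

In this sketch, a user would compute the statistic before clustering and proceed only if the value is well above 0.5. Any specific cutoff is a judgment call and is not taken from the paper.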