LG · Jun 19, 2015

A new Initial Centroid finding Method based on Dissimilarity Tree for K-means Algorithm

arXiv:1509.03200v12.15 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses cluster quality and efficiency for data-mining users of the K-means algorithm and represents an incremental improvement.

The paper tackles the problem of K-means clustering's sensitivity to initial centroid selection by proposing a method based on a Dissimilarity Tree to find better initial centroids, resulting in more accurate clusters and reduced computational time, as supported by theory and experiments.

Cluster analysis is one of the primary data analysis techniques in data mining, and K-means is one of the most commonly used partitioning clustering algorithms. In the K-means algorithm, the resulting set of clusters depends on the choice of initial centroids. If we can find initial centroids that are coherent with the arrangement of the data, a better set of clusters can be obtained. This paper proposes a method based on the Dissimilarity Tree to find better initial centroids and, in turn, more accurate clusters with less computational time. Theoretical analysis and experimental results indicate that the proposed method can effectively improve the accuracy of clusters and reduce the computational complexity of the K-means algorithm.
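The dependence on initial centroids described in the abstract can be illustrated with a minimal sketch. It is not the paper's Dissimilarity Tree method, which this summary does not describe. It runs a plain Lloyd-style K-means on four hand-picked points from two different starting positions, then from a simple farthest-first seeding used here only as a stand-in heuristic. All names (`sq_dist`, `kmeans`, `farthest_first`, `POINTS`) and the data are assumptions made for illustration.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, init_centroids, max_iters=100):
    """Plain Lloyd iterations from the given initial centroids.

    Returns the final centroids and the sum of squared errors (SSE).
    """
    centroids = [tuple(c) for c in init_centroids]
    for _ in range(max_iters):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(coord) / len(members) for coord in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    sse = sum(min(sq_dist(p, c) for c in centroids) for p in points)
    return centroids, sse


def farthest_first(points, k):
    """Stand-in seeding heuristic (NOT the paper's method): start from the
    first point, then repeatedly add the point farthest from those chosen."""
    chosen = [points[0]]
    while len(chosen) < k:
        chosen.append(max(points,
                          key=lambda p: min(sq_dist(p, c) for c in chosen)))
    return chosen


# Hypothetical data: corners of a wide rectangle, so the natural
# two-cluster split is left pair versus right pair.
POINTS = [(0, 0), (0, 1), (4, 0), (4, 1)]

_, sse_poor = kmeans(POINTS, [(0, 0), (0, 1)])        # both seeds on the left
_, sse_seeded = kmeans(POINTS, farthest_first(POINTS, 2))
print(sse_poor, sse_seeded)  # 16.0 1.0
```

With both seeds on the left edge, the iterations settle on a top/bottom split and stay there, while seeds that are far apart recover the left/right split with a much lower SSE. This is the kind of gap that any better initialization scheme, including the one the paper proposes, aims to close.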

