Effect of Different Distance Measures on the Performance of K-Means Algorithm: An Experimental Study in Matlab
This work provides practical guidance for selecting distance measures in K-means clustering, but it is incremental as it applies existing methods to standard datasets.
The study experimentally evaluated how different distance measures affect the performance of the K-means algorithm on the Iris and Wine datasets in Matlab, finding that performance varies with the type of data and the distance measure used.
The K-means algorithm is a popular clustering algorithm known for its simplicity. The distance measure plays a very important role in the performance of this algorithm. Several distance measures are available, but choosing an appropriate one depends on the type of data to be clustered. In this paper, an experimental study is carried out in Matlab to cluster the Iris and Wine data sets with different distance measures and to observe the resulting variation in performance.
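To make the role of the distance measure concrete, the following is a minimal Python sketch of K-means in which the distance function is a parameter. It is not the authors' Matlab code. The three measures shown (Euclidean, city block, cosine), the function names, the toy points, and the explicit initial centroids are all assumptions chosen for illustration. The sketch also keeps the mean as the centroid update for every measure, which is a simplification; implementations may adapt the update to the measure.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cityblock(a, b):
    """Sum of absolute coordinate differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def kmeans(points, init_centroids, distance, max_iters=100):
    """Cluster points around len(init_centroids) centroids.

    Assumption of this sketch: centroids are always updated with the
    component-wise mean, whatever distance function is passed in.
    """
    centroids = [list(c) for c in init_centroids]
    k = len(centroids)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: two small groups, not the Iris or Wine data.
toy_points = [
    (1.0, 0.1), (1.2, 0.2), (0.9, 0.15),
    (0.1, 1.0), (0.2, 1.1), (0.15, 0.9),
]

for name, dist in [("euclidean", euclidean),
                   ("cityblock", cityblock),
                   ("cosine", cosine)]:
    labels, _ = kmeans(toy_points, [toy_points[0], toy_points[3]], dist)
    print(name, labels)
```

On well-separated toy data like this, the choice of measure makes little difference. The paper's point is that on real data such as Iris and Wine, the same swap can change the clustering result.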