LG, May 29, 2014

Effect of Different Distance Measures on the Performance of K-Means Algorithm: An Experimental Study in Matlab

arXiv:1405.7471v1, 130 citations
Originality: Synthesis-oriented
AI Analysis

This work provides practical guidance for selecting distance measures in K-means clustering, but it is incremental as it applies existing methods to standard datasets.

The study experimentally evaluated how different distance measures affect the performance of the K-means algorithm on iris and wine datasets in Matlab, finding that performance varies based on the data type and distance measure used.

The K-means algorithm is a very popular clustering algorithm, known for its simplicity. The distance measure plays a very important role in the performance of this algorithm. Several distance measures are available, but choosing a proper one depends entirely on the type of data to be clustered. In this paper, an experimental study is carried out in Matlab to cluster the iris and wine data sets with different distance measures and to observe the resulting variation in performance.
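
The idea of swapping the distance measure inside K-means can be sketched in a few lines. The example below is an illustrative sketch in Python, not the paper's Matlab code: the function names (`euclidean`, `cityblock`, `cosine`, `kmeans`), the toy points, and the fixed initial centroids are all assumptions made for this illustration. As a simplification, it keeps the plain mean as the centroid update for every distance, which is only the natural choice for the Euclidean case.

```python
import math


def euclidean(a, b):
    # Straight-line distance between two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cityblock(a, b):
    # Sum of absolute coordinate differences (Manhattan distance).
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine(a, b):
    # One minus the cosine of the angle between the vectors.
    # Assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def kmeans(points, init_centroids, distance, max_iter=100):
    # Hypothetical helper: K-means with a pluggable distance function.
    # Simplification: centroids are always updated with the mean.
    centroids = [list(c) for c in init_centroids]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: distance(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Toy data (made up for this sketch): two well-separated groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
init = [points[0], points[3]]

for name, dist in [("euclidean", euclidean), ("cityblock", cityblock), ("cosine", cosine)]:
    labels, _ = kmeans(points, init, dist)
    print(name, labels)
```

On this toy data both groups lie in roughly the same direction from the origin, so a direction-based measure such as cosine has little to separate them by, while the Euclidean and city-block measures split them by position. This is the kind of data-dependent behaviour the study examines on the iris and wine data sets.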

