LGSep 5, 2012

Improving the K-means algorithm using improved downhill simplex search

Ehsan Saboori, Shafigh Parsazad, Anoosheh Sadeghi

arXiv:1209.0853v14 citations

Originality Incremental advance

AI Analysis

This addresses the issue of poor clustering quality due to local optima in k-means for data analysis applications, but it is incremental as it builds on existing optimization methods.

The paper tackled the problem of k-means converging to local optima by proposing an improved downhill simplex search to find optimal initial partitions, resulting in comparisons showing it outperforms GA, GKM, IGKM, and standard k-means algorithms.

The k-means algorithm is one of the well-known and most popular clustering algorithms. K-means seeks an optimal partition of the data by minimizing the sum of squared error with an iterative optimization procedure, which belongs to the category of hill climbing algorithms. As we know hill climbing searches are famous for converging to local optimums. Since k-means can converge to a local optimum, different initial points generally lead to different convergence cancroids, which makes it important to start with a reasonable initial partition in order to achieve high quality clustering solutions. However, in theory, there exist no efficient and universal methods for determining such initial partitions. In this paper we tried to find an optimum initial partitioning for k-means algorithm. To achieve this goal we proposed a new improved version of downhill simplex search, and then we used it in order to find an optimal result for clustering approach and then compare this algorithm with Genetic Algorithm base (GA), Genetic K-Means (GKM), Improved Genetic K-Means (IGKM) and k-means algorithms.

View on arXiv PDF

Similar