Performance Evaluation of Incremental K-means Clustering Algorithm
This is an incremental study for researchers in data mining, focusing on clustering efficiency in periodically updated environments.
The paper evaluates the incremental K-means clustering algorithm on an air pollution database, compares it with the existing K-means algorithm, and identifies the threshold of database change up to which incremental K-means performs better.
The incremental K-means clustering algorithm was proposed and analysed in [Chakraborty and Nagwani, 2011]. It is an innovative approach suited to environments in which the database grows periodically and updates arrive in bulk. This paper evaluates the performance of the incremental K-means clustering algorithm on an air pollution database and compares it with the performance of the existing K-means clustering algorithm on the same database. It also determines the amount of change in the database up to which incremental K-means clustering performs much better than existing K-means clustering. This point of change is called the "threshold value" or "% delta change in the database". Finally, the paper describes the basic methodology of the incremental K-means clustering algorithm.
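To make the idea of an incremental update and a "% delta change" concrete, the following is a minimal sketch and not the algorithm from [Chakraborty and Nagwani, 2011]. It assumes that each newly inserted record joins its nearest existing centroid and that the centroid is then moved to the running mean of its members. The function names (`incremental_update`, `percent_delta`, `use_incremental`) and the `threshold_pct` parameter are hypothetical, and the threshold itself would have to be found experimentally.

```python
from math import dist


def nearest(point, centroids):
    """Return the index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


def incremental_update(centroids, counts, new_points):
    """Fold new_points into existing clusters without re-clustering old data.

    Assumption (not taken from the paper): a new point joins its nearest
    existing centroid, which is then moved to the running mean of its members.
    """
    centroids = [list(c) for c in centroids]  # work on copies
    counts = list(counts)
    for p in new_points:
        i = nearest(p, centroids)
        counts[i] += 1
        # Running-mean update: c_new = c_old + (p - c_old) / n
        centroids[i] = [c + (x - c) / counts[i] for c, x in zip(centroids[i], p)]
    return centroids, counts


def percent_delta(n_existing, n_new):
    """Size of an update as a percentage of the existing database."""
    return 100.0 * n_new / n_existing


def use_incremental(n_existing, n_new, threshold_pct):
    """Hypothetical policy: update incrementally while the change stays at or
    below an experimentally determined threshold, otherwise re-cluster."""
    return percent_delta(n_existing, n_new) <= threshold_pct


if __name__ == "__main__":
    cents, cnts = incremental_update(
        [[0.0, 0.0], [10.0, 10.0]], [2, 2], [(1.0, 1.0), (9.0, 9.0)]
    )
    print(cents, cnts)
    print(percent_delta(200, 10), use_incremental(200, 10, threshold_pct=20.0))
```

In this sketch the cost of an update depends only on the number of new records and the number of clusters, which is why an incremental scheme can be cheaper than a full re-run when the change is small.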