LG · Sep 30, 2013

An Extensive Experimental Study on the Cluster-based Reference Set Reduction for speeding-up the k-NN Classifier

arXiv:1309.7750v2
Originality: Synthesis-oriented
AI Analysis

This is an incremental study of efficiency improvements for k-NN classifiers, which are widely used across many applications.

The paper tackles the high computational cost of k-NN classification through an extensive experimental study of a cluster-based speed-up method. Results on five real-life datasets show that careful parameter tuning can achieve better classification performance.

The k-Nearest Neighbor (k-NN) classification algorithm is one of the most widely used lazy classifiers because of its simplicity and ease of implementation. It is considered an effective classifier and has many applications. Its major drawback, however, is that finding the neighbors by sequential search incurs a high computational cost. Speeding up k-NN search is still an active research field. Hwang and Cho recently proposed an adaptive cluster-based method for fast Nearest Neighbor searching. The effectiveness of this method depends on the adjustment of three parameters. However, the authors evaluated their method with specific parameter values and on only one dataset. This paper presents an extensive experimental study of the method. The results, based on five real-life datasets, illustrate that if the parameters of the method are carefully defined, one can achieve even better classification performance.
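To make the cost argument concrete, the following is a minimal sketch of the general idea of cluster-based reference set reduction. It is not Hwang and Cho's method and does not model its three parameters. It assumes plain k-means for clustering, and the names `kmeans`, `reduced_knn_predict`, and `n_probe` are hypothetical, introduced here only for illustration. Sequential search computes a distance to every reference point, while the reduced search first picks the nearest cluster centroids and then runs k-NN only over the points in those clusters.

```python
import math
import random
from collections import Counter


def knn_predict(query, points, labels, k):
    """Sequential-search k-NN: one distance computation per reference point."""
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumption of this sketch, not the paper's clustering)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, n_clusters)
    for _ in range(n_iter):
        assign = [nearest_centroid(p, centroids) for p in points]
        for c in range(n_clusters):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:  # keep the old centroid if a cluster goes empty
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    # Final assignment so that it matches the final centroids.
    assign = [nearest_centroid(p, centroids) for p in points]
    return centroids, assign


def reduced_knn_predict(query, points, labels, centroids, assign, k, n_probe=1):
    """k-NN restricted to the n_probe clusters whose centroids are nearest the query.

    `n_probe` is a hypothetical knob for this sketch only.
    """
    ranked = sorted(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))
    probe = set(ranked[:n_probe])
    keep = [i for i, a in enumerate(assign) if a in probe]
    if not keep:  # fall back to the full reference set if the probed clusters are empty
        keep = list(range(len(points)))
    return knn_predict(query, [points[i] for i in keep], [labels[i] for i in keep], k)


# Synthetic demo data (not from the paper): two well-separated 2-D blobs.
rng = random.Random(42)
points = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(100)]
points += [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(100)]
labels = ["a"] * 100 + ["b"] * 100

centroids, assign = kmeans(points, n_clusters=4)
full = knn_predict((10.0, 10.0), points, labels, k=5)
reduced = reduced_knn_predict((10.0, 10.0), points, labels, centroids, assign, k=5)
```

In this sketch the reduced search trades accuracy for speed: a query near a cluster boundary may miss true neighbors that fall in a cluster that was not probed. This is one reason parameter choices in such methods can affect classification performance.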
