Feature selection using nearest attributes
This work addresses the problem of improving classification accuracy in high-dimensional data analysis for domains such as image processing and bioinformatics, though it appears incremental because it builds on existing feature selection concepts.
The paper tackles feature selection in high-dimensional classification by introducing a method that selects features based on their discriminatory ability among classes, using the area of overlap between inter-class and intra-class distances, and reports state-of-the-art recognition results with benchmark databases for images and microarray data.
Feature selection is an important problem in high-dimensional data analysis and classification. Conventional feature selection approaches detect features according to a redundancy criterion, using learning and feature-search schemes. In contrast, we present an approach that selects features according to their ability to discriminate among classes. The area of overlap between the inter-class and intra-class distances obtained from feature-to-feature comparison of an attribute is used as a measure of the discriminatory ability of that feature. The set of nearest attributes in a pattern whose area of overlap is lowest, within a degree of tolerance defined by a selection threshold, is selected to represent the most discriminable features available. State-of-the-art recognition results are reported for pattern classification problems using the proposed feature selection scheme with the nearest neighbour classifier. These results are reported on benchmark databases with high-dimensional feature vectors, for problems involving images and microarray data.
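
The overlap criterion described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how distances are computed, how the overlap area is estimated, or how the selection threshold is applied. The sketch assumes absolute differences between sample pairs as the per-feature distance, shared-bin histograms as the density estimate, and a tolerance added to the lowest observed overlap as the selection rule. The function names `overlap_area`, `feature_overlaps`, and `select_features`, and the parameters `bins` and `threshold`, are hypothetical.

```python
import numpy as np


def overlap_area(intra, inter, bins=20):
    """Approximate the overlap area of two 1-D distance samples.

    Assumption: both samples are binned on a shared grid and the overlap is
    the integral of the pointwise minimum of the two density histograms.
    """
    lo = min(intra.min(), inter.min())
    hi = max(intra.max(), inter.max())
    if hi == lo:
        return 1.0  # all distances identical: treat as fully overlapping
    edges = np.linspace(lo, hi, bins + 1)
    p_intra, _ = np.histogram(intra, bins=edges, density=True)
    p_inter, _ = np.histogram(inter, bins=edges, density=True)
    width = edges[1] - edges[0]
    return float(np.sum(np.minimum(p_intra, p_inter)) * width)


def feature_overlaps(X, y, bins=20):
    """Overlap score per feature; lower means more discriminative.

    Requires at least one same-class pair and one different-class pair.
    """
    n, d = X.shape
    i, j = np.triu_indices(n, k=1)      # all unordered sample pairs
    same = y[i] == y[j]                 # True for intra-class pairs
    scores = np.empty(d)
    for f in range(d):
        dist = np.abs(X[i, f] - X[j, f])  # assumed per-feature distance
        scores[f] = overlap_area(dist[same], dist[~same], bins)
    return scores


def select_features(X, y, threshold=0.1, bins=20):
    """Keep features whose overlap is within `threshold` of the lowest one.

    Assumption: the tolerance is additive relative to the best score.
    """
    scores = feature_overlaps(X, y, bins)
    return np.flatnonzero(scores <= scores.min() + threshold)
```

The selected column indices could then be used to restrict the feature vectors before a nearest neighbour classifier is applied. The pairwise comparison costs O(n²) per feature, so for large sample counts a subsample of pairs would be a reasonable variation.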