Feature Selection Using Classifier in High Dimensional Data
This work addresses feature selection for machine learning in bioinformatics, but it is incremental as it uses existing methods on new data.
The paper tackles feature selection in high-dimensional data by applying filter and wrapper approaches with QDA and LDA classifiers, and finds that the filter method achieves a lower misclassification error rate.
Feature selection is frequently used as a pre-processing step in machine learning. It is the process of choosing a subset of the original features so that the feature space is optimally reduced according to a given evaluation criterion. The central objective of this paper is to reduce the dimension of the data by finding a small set of important features that gives good classification performance. We apply filter and wrapper approaches with the QDA (quadratic discriminant analysis) and LDA (linear discriminant analysis) classifiers, respectively. For the bioinformatics data, we first use a widely used filter method, i.e. a univariate criterion applied separately to each feature under the assumption that features do not interact, and then apply the Sequential Feature Selection method. Experimental results show that the filter approach gives better performance with respect to Misclassification Error Rate.
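The two stages described above can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a two-class problem, uses the absolute Welch t-statistic as the univariate filter criterion, and runs greedy sequential forward selection on the filtered features. To stay dependency-free, a nearest-centroid classifier stands in for QDA and LDA. The synthetic data, the function names, and all sizes (200 features, 5 informative, top 20 kept) are hypothetical choices made only for this example.

```python
import random
from statistics import mean, variance


def make_data(n, p, informative, shift, rng):
    """Synthetic two-class data: only `informative` columns differ between classes."""
    y = [i % 2 for i in range(n)]
    X = [[rng.gauss(shift if (label == 1 and j in informative) else 0.0, 1.0)
          for j in range(p)] for label in y]
    return X, y


def t_scores(X, y):
    """Filter step: absolute Welch t-statistic of each feature, scored independently."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
        scores.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return scores


def centroid_error(X_tr, y_tr, X_te, y_te, feats):
    """Misclassification error rate of a nearest-centroid classifier (stand-in for LDA/QDA)."""
    cents = {}
    for c in set(y_tr):
        rows = [r for r, label in zip(X_tr, y_tr) if label == c]
        cents[c] = [mean(r[j] for r in rows) for j in feats]
    wrong = 0
    for r, label in zip(X_te, y_te):
        pred = min(cents, key=lambda c: sum((r[j] - m) ** 2
                                            for j, m in zip(feats, cents[c])))
        wrong += pred != label
    return wrong / len(y_te)


def forward_select(X_tr, y_tr, X_val, y_val, candidates, max_feats):
    """Sequential forward selection: add the feature that most lowers validation error."""
    selected, best_err = [], 1.0
    while len(selected) < max_feats:
        trials = [(centroid_error(X_tr, y_tr, X_val, y_val, selected + [j]), j)
                  for j in candidates if j not in selected]
        if not trials:
            break
        err, j = min(trials)
        if err >= best_err:  # stop once no candidate improves the error
            break
        selected.append(j)
        best_err = err
    return selected, best_err


rng = random.Random(0)
informative = {0, 1, 2, 3, 4}  # hypothetical ground truth for the toy data
X_tr, y_tr = make_data(60, 200, informative, 2.0, rng)
X_val, y_val = make_data(40, 200, informative, 2.0, rng)

# Stage 1 (filter): keep the 20 features with the largest univariate score.
scores = t_scores(X_tr, y_tr)
top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:20]

# Stage 2: sequential forward selection restricted to the filtered features.
selected, err = forward_select(X_tr, y_tr, X_val, y_val, top, max_feats=5)
print("selected features:", selected, "validation error rate:", err)
```

Because the validation set guides the selection here, the reported error rate is optimistic. A fair comparison of approaches would estimate the Misclassification Error Rate on data that played no part in selecting the features.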