Exploiting the Accumulated Evidence for Gene Selection in Microarray Gene Expression Data
This incremental method addresses feature selection challenges in high-dimensional, low-sample cancer classification tasks.
The paper tackles gene selection in microarray gene expression data for cancer classification by accumulating evidence for or against each gene during the search process, which can yield gene subsets with improved predictive accuracy, smaller size, or both.
Machine learning methods have recently been applied extensively to cancer classification using microarray gene expression data. Feature subset selection methods can play an important role in the modeling process, since these tasks are characterized by a large number of features and few observations, which makes modeling a non-trivial undertaking. In this scenario, it is extremely important to select genes by taking into account their possible interactions with other gene subsets. This paper shows that, by accumulating the evidence in favour of (or against) each gene along the search process, the obtained gene subsets may constitute better solutions in terms of predictive accuracy, gene subset size, or both. The proposed technique is extremely simple and adds negligible computational overhead.
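The idea of accumulating evidence along the search can be illustrated with a minimal Python sketch. It is not the paper's algorithm, whose details are not given here. It assumes a greedy forward search in which every candidate evaluation already performed by the search also updates a per-gene evidence score, so the bookkeeping adds almost no cost. The names `forward_select_with_evidence`, `toy_score`, and `TOY_WEIGHTS` are hypothetical, and the toy scoring function merely stands in for a real estimate such as cross-validated classification accuracy.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def forward_select_with_evidence(
    genes: Sequence[int],
    score: Callable[[List[int]], float],
    max_genes: int,
) -> Tuple[List[int], Dict[int, float]]:
    """Greedy forward search that ranks candidates by accumulated evidence.

    Hypothetical sketch: evidence[g] sums every score change observed
    when gene g was tried, across all search steps so far.
    """
    selected: List[int] = []
    evidence: Dict[int, float] = {g: 0.0 for g in genes}

    while len(selected) < max_genes:
        candidates = [g for g in genes if g not in selected]
        if not candidates:
            break
        base = score(selected)
        # Each evaluation adds evidence in favour (gain > 0) or against (gain < 0).
        for g in candidates:
            evidence[g] += score(selected + [g]) - base
        best = max(candidates, key=lambda g: evidence[g])
        if evidence[best] <= 0:
            break  # no remaining gene has net evidence in its favour
        selected.append(best)

    return selected, evidence


# Toy scoring function (an assumption standing in for cross-validated accuracy).
TOY_WEIGHTS = {0: 0.5, 1: 0.3, 2: 0.0, 3: -0.2}


def toy_score(subset: List[int]) -> float:
    return sum(TOY_WEIGHTS[g] for g in subset)


selected, evidence = forward_select_with_evidence(
    [0, 1, 2, 3], toy_score, max_genes=3
)
```

In this toy run the search stops once no remaining gene has positive accumulated evidence, so it may return fewer than `max_genes` genes. With a real classifier, `score` would be replaced by an accuracy estimate on the candidate subset, and the stopping and ranking rules here remain design assumptions rather than the published procedure.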