QM CE LG ML
Feb 7, 2013

Feature Selection for Microarray Gene Expression Data using Simulated Annealing guided by the Multivariate Joint Entropy

arXiv:1302.1733v1
27 citations

Originality: Incremental advance

AI Analysis

This work addresses gene selection for microarray data analysis. The contribution is incremental: it builds on prior TAFS algorithms, adding a new entropy calculation and optimization method.

The authors tackled feature selection for microarray gene expression data by developing a new multivariate joint entropy measure and the mu-TAFS algorithm using simulated annealing, resulting in high classification performance and small, biologically meaningful gene subsets.

In this work a new way to calculate the multivariate joint entropy is presented. This measure is the basis for a fast information-theoretic evaluation of gene relevance in a Microarray Gene Expression data context. Its low complexity comes from reusing previous computations to calculate the relevance of the current feature. The mu-TAFS algorithm --named as such to differentiate it from previous TAFS algorithms-- implements a simulated annealing technique specially designed for feature subset selection. The algorithm is applied to the maximization of gene subset relevance in several public-domain microarray data sets. The experimental results show notably high classification performance and small subsets formed by biologically meaningful genes.
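
To make the idea in the abstract concrete, here is a minimal sketch of simulated annealing over feature subsets, scored by an entropy-based relevance measure. It is not the mu-TAFS algorithm itself. It assumes discretized expression values, takes relevance to be the mutual information I(X_S; Y) = H(X_S) + H(Y) - H(X_S, Y), and recomputes each entropy from scratch rather than reusing earlier computations as the paper describes. The function names, the subset-size penalty, and the cooling schedule are all hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def joint_entropy(rows):
    """Empirical joint entropy (in bits) of a list of hashable tuples."""
    n = len(rows)
    counts = Counter(rows)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def relevance(X, y, subset):
    """Assumed relevance: I(X_S; Y) = H(X_S) + H(Y) - H(X_S, Y)."""
    if not subset:
        return 0.0
    cols = sorted(subset)
    xs = [tuple(row[j] for j in cols) for row in X]
    xsy = [x + (label,) for x, label in zip(xs, y)]
    hy = joint_entropy([(label,) for label in y])
    return joint_entropy(xs) + hy - joint_entropy(xsy)


def anneal(X, y, n_steps=2000, t0=1.0, cooling=0.995,
           size_penalty=0.01, seed=0):
    """Simulated annealing over feature subsets (illustrative only)."""
    rng = random.Random(seed)
    n_features = len(X[0])

    def score(subset):
        # Hypothetical objective: relevance minus a small size penalty.
        return relevance(X, y, subset) - size_penalty * len(subset)

    current = {rng.randrange(n_features)}
    cur_score = score(current)
    best, best_score = set(current), cur_score
    t = t0
    for _ in range(n_steps):
        j = rng.randrange(n_features)
        candidate = current ^ {j}  # add feature j if absent, else remove it
        if candidate:  # never move to the empty subset
            delta = score(candidate) - cur_score
            # Always accept improvements; accept worse moves with
            # probability exp(delta / t), which shrinks as t cools.
            if delta >= 0 or rng.random() < math.exp(delta / t):
                current, cur_score = candidate, cur_score + delta
                if cur_score > best_score:
                    best, best_score = set(current), cur_score
        t *= cooling
    return best, best_score
```

Calling `anneal(X, y)` with `X` as a list of rows of discretized expression levels and `y` as the class labels returns the best subset visited (as a set of column indices) together with its score. Real microarray data has thousands of genes and few samples, so a practical implementation would need the incremental entropy computation the abstract mentions.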
