LG CE

Jul 4, 2013

Examining the Classification Accuracy of TSVMs with Feature Selection in Comparison with the GLAD Algorithm

Hala Helmi, Jon M. Garibaldi, Uwe Aickelin

arXiv:1307.1387v1

1 citation

Originality: Synthesis-oriented

AI Analysis

This paper addresses the challenge of expensive labeled data acquisition in gene expression analysis for medical diagnostics, but it appears to be an incremental improvement over existing semi-supervised methods.

The paper tackles the problem of classifying gene expression data with limited labeled samples by proposing a TSVM-RFE method that combines transductive support vector machines with recursive feature elimination. The results show that TSVM-RFE surpassed both SVM-RFE and the GLAD algorithm in classification accuracy, though no specific numerical improvements are provided.

Gene expression data sets are used to classify and predict patient diagnostic categories. However, labelled gene expression examples are extremely difficult and expensive to obtain, and conventional supervised approaches such as Support Vector Machines (SVMs) cannot function properly when labelled data (training examples) are insufficient. Therefore, in this paper, we suggest Transductive Support Vector Machines (TSVMs) as semi-supervised learning algorithms, which learn from both labelled and unlabelled samples, to perform the classification of microarray data. To prune superfluous genes and samples, we used a feature selection method called Recursive Feature Elimination (RFE), which is intended to improve the classification output and avoid the local optimization problem. We examined the classification prediction accuracy of the TSVM-RFE algorithm in comparison with the Genetic Learning Across Datasets (GLAD) algorithm, as both are semi-supervised learning methods. Comparing these two methods, we found that TSVM-RFE surpassed both an SVM using RFE and GLAD.
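To make the idea of combining semi-supervised learning with RFE more concrete, here is a minimal Python sketch. It is not the authors' TSVM-RFE implementation. It assumes a simple self-training loop (pseudo-label the unlabelled samples, then retrain) as a stand-in for a true transductive SVM, a Pegasos-style linear SVM without a bias term, and the common RFE ranking criterion of dropping the feature with the smallest squared weight. All function names, parameters, and the synthetic data are hypothetical.

```python
import random

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def train_linear_svm(X, y, epochs=200, lam=0.01, seed=0):
    """Pegasos-style stochastic subgradient descent for a linear SVM (no bias)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)                      # decaying step size
            violated = y[i] * dot(w, X[i]) < 1         # margin check on current w
            w = [(1.0 - eta * lam) * wj for wj in w]   # regularisation shrink
            if violated:                               # hinge-loss subgradient step
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def predict(w, x):
    return 1 if dot(w, x) >= 0 else -1

def svm_rfe(X, y, n_keep):
    """Drop the feature with the smallest squared weight until n_keep remain."""
    active = list(range(len(X[0])))
    while len(active) > n_keep:
        X_sub = [[row[j] for j in active] for row in X]
        w = train_linear_svm(X_sub, y)
        worst = min(range(len(active)), key=lambda k: w[k] ** 2)
        del active[worst]
    return active

def self_training_rfe(X_lab, y_lab, X_unlab, n_keep, rounds=3):
    """Assumed stand-in for TSVM-RFE: pseudo-label, retrain, then run RFE."""
    w = train_linear_svm(X_lab, y_lab)
    for _ in range(rounds):
        pseudo = [predict(w, x) for x in X_unlab]
        w = train_linear_svm(X_lab + X_unlab, y_lab + pseudo)
    pseudo = [predict(w, x) for x in X_unlab]
    return svm_rfe(X_lab + X_unlab, y_lab + pseudo, n_keep)

# Synthetic toy data (not microarray data): feature 0 carries the class signal,
# features 1-4 are pure noise.
data_rng = random.Random(42)

def make_sample(label):
    signal = 3.0 * label + data_rng.gauss(0, 0.3)
    return [signal] + [data_rng.gauss(0, 1) for _ in range(4)]

labels = [1, -1] * 18
X_all = [make_sample(lbl) for lbl in labels]
X_lab, y_lab = X_all[:6], labels[:6]   # only six labelled samples
X_unlab = X_all[6:]                    # labels of the remaining samples are withheld

selected = self_training_rfe(X_lab, y_lab, X_unlab, n_keep=2)
print("surviving feature indices:", selected)
```

On this toy data the informative feature should survive elimination, because its weight dominates the noise features. A real experiment would replace the self-training loop with an actual TSVM solver and evaluate accuracy on held-out samples.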
