ML LG · May 9, 2020

A Compressive Classification Framework for High-Dimensional Data

arXiv:2005.04383v21.4

Originality: Incremental advance

AI Analysis

This is an incremental improvement for researchers and practitioners who classify high-dimensional data, offering a method that combines feature selection with regularized covariance estimation.

The paper addresses classification of high-dimensional data in which the number of features exceeds the number of samples. It proposes compressive regularized discriminant analysis (CRDA), which the authors report gives fewer misclassification errors than competing methods while selecting features accurately, as demonstrated on real image, speech, and gene expression data sets.

We propose a compressive classification framework for settings where the data dimensionality is significantly higher than the sample size. The proposed method, referred to as compressive regularized discriminant analysis (CRDA), is based on linear discriminant analysis and selects significant features by applying joint-sparsity promoting hard thresholding in the discriminant rule. Since the number of features is larger than the sample size, the method also uses state-of-the-art regularized sample covariance matrix estimators. Several analysis examples on real data sets, including image, speech signal, and gene expression data, illustrate the promising improvements offered by the proposed CRDA classifier in practice. Overall, the proposed method gives fewer misclassification errors than its competitors while at the same time achieving accurate feature selection. The open-source R package and MATLAB toolbox of the proposed method (named compressiveRDA) are freely available.
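
The abstract names three ingredients: a linear discriminant rule, a regularized covariance estimate for the case of more features than samples, and row-wise (joint-sparsity) hard thresholding of the coefficient matrix. The following is a minimal NumPy sketch of how these pieces could fit together, not the authors' implementation and not the API of the compressiveRDA package. The function names, the fixed shrinkage weight `beta`, the use of the Euclidean row norm, and the user-supplied number of retained features `K` are all assumptions made for illustration; the paper's own covariance estimators and tuning rules are not reproduced here.

```python
import numpy as np


def regularized_covariance(X, y, beta=0.5):
    """Pooled within-class covariance shrunk toward a scaled identity.

    ASSUMPTION: beta is a hand-picked shrinkage weight in (0, 1); the paper
    uses data-driven regularized estimators that are not reproduced here.
    """
    classes = np.unique(y)
    n, p = X.shape
    S = np.zeros((p, p))
    for g in classes:
        Xg = X[y == g]
        Xc = Xg - Xg.mean(axis=0)          # center each class separately
        S += Xc.T @ Xc
    S /= n - len(classes)                  # pooled covariance (singular if p > n)
    eta = np.trace(S) / p                  # average variance sets the identity scale
    return beta * S + (1.0 - beta) * eta * np.eye(p)


def hard_threshold_rows(B, K):
    """Keep the K rows of B with the largest Euclidean norm; zero the rest."""
    norms = np.linalg.norm(B, axis=1)
    keep = np.sort(np.argsort(norms)[-K:])  # indices of the K largest row norms
    B_sparse = np.zeros_like(B)
    B_sparse[keep] = B[keep]
    return B_sparse, keep


def fit_sparse_lda(X, y, K, beta=0.5):
    """Fit an LDA-style rule whose coefficient matrix has only K nonzero rows."""
    classes = np.unique(y)
    M = np.column_stack([X[y == g].mean(axis=0) for g in classes])  # p x G means
    priors = np.array([np.mean(y == g) for g in classes])
    Sigma = regularized_covariance(X, y, beta)
    B = np.linalg.solve(Sigma, M)           # Sigma^{-1} M without forming the inverse
    B_sparse, selected = hard_threshold_rows(B, K)
    # Per-class constant of the linear discriminant: -0.5 * m_g^T b_g + log(prior_g)
    offsets = -0.5 * np.sum(M * B_sparse, axis=0) + np.log(priors)
    return {"classes": classes, "B": B_sparse,
            "offsets": offsets, "selected": selected}


def predict_sparse_lda(model, X):
    """Assign each row of X to the class with the largest discriminant score."""
    scores = X @ model["B"] + model["offsets"]
    return model["classes"][np.argmax(scores, axis=1)]


if __name__ == "__main__":
    # Synthetic data with more features (50) than samples (40).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(40, 50))
    y = np.repeat([0, 1], 20)
    X[y == 1, :5] += 3.0                    # only the first 5 features carry signal
    model = fit_sparse_lda(X, y, K=5)
    print("selected features:", model["selected"])
    print("training accuracy:", np.mean(predict_sparse_lda(model, X) == y))
```

Because whole rows of the coefficient matrix are zeroed, a discarded feature drops out of the score for every class at once, which is what makes the thresholding act as feature selection rather than per-class sparsification. In this sketch `K` is supplied by hand; in practice it would need to be chosen, for example by cross-validation.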
