Distance-based classifier by data transformation for high-dimension, strongly spiked eigenvalue models
The work addresses classification in high-dimensional domains such as genomics, but it appears incremental because it builds on existing noise-reduction and data-transformation methods.
The paper tackles classification of high-dimensional data under strongly spiked eigenvalue (SSE) models. It develops a new distance-based classifier that transforms the data from an SSE model to a non-SSE model, and it reports improved performance in simulations and on microarray data sets.
We consider classifiers for high-dimensional data under the strongly spiked eigenvalue (SSE) model. We first show that high-dimensional data often follow the SSE model. We then consider a distance-based classifier that uses the eigenstructure of the SSE model, and we apply the noise-reduction methodology to estimate the eigenvalues and eigenvectors in that model. We create a new distance-based classifier by transforming the data from the SSE model to the non-SSE model. We present simulation studies and discuss the performance of the new classifier. Finally, we demonstrate the new classifier on microarray data sets.
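As a rough illustration of the idea summarized above, the following Python sketch projects out the leading eigen-directions and then applies a bias-corrected nearest-centroid rule to the transformed data. It is not the paper's estimator and rests on several assumptions. It uses plain sample eigenvectors from an SVD in place of the noise-reduction methodology. It pools the two centered classes to estimate the spiked directions. It treats the number of spikes `k` as known. The spike measure (largest eigenvalue squared over the sum of squared eigenvalues) is an assumed diagnostic. All function names and the toy data generator are hypothetical.

```python
import numpy as np

def spike_ratio(X):
    """Largest sample eigenvalue squared over the sum of squared eigenvalues.

    Assumed diagnostic: values near 1 suggest one dominant (spiked) direction.
    """
    Xc = X - X.mean(axis=0)
    # The n x n dual matrix shares its nonzero eigenvalues with the
    # p x p sample covariance, and is much cheaper when p >> n.
    dual = Xc @ Xc.T / (X.shape[0] - 1)
    eig = np.linalg.eigvalsh(dual)  # ascending order
    return float(eig[-1] ** 2 / np.sum(eig ** 2))

def top_eigvecs(X, k):
    """First k sample eigenvectors (p x k) of the covariance of X.

    Assumption: plain SVD estimates, not noise-reduction estimates.
    """
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T

def remove_spikes(X, H):
    """Project rows of X onto the orthogonal complement of the columns of H."""
    return X - (X @ H) @ H.T

def classify(x0, X1, X2, k=1):
    """Return 1 or 2: nearest centroid after spike removal (simplified)."""
    # Assumption: both classes share the spiked directions, so pool them.
    pooled = np.vstack([X1 - X1.mean(axis=0), X2 - X2.mean(axis=0)])
    H = top_eigvecs(pooled, k)
    Y1, Y2, y0 = remove_spikes(X1, H), remove_spikes(X2, H), remove_spikes(x0, H)

    def score(Y):
        n = Y.shape[0]
        # tr(S)/n corrects the bias of the squared distance to a sample mean.
        trace_S = np.sum((Y - Y.mean(axis=0)) ** 2) / (n - 1)
        return np.sum((y0 - Y.mean(axis=0)) ** 2) - trace_S / n

    return 1 if score(Y1) < score(Y2) else 2

def make_spiked(n, p, mu, lam, rng):
    """Toy data (hypothetical): one spike of size lam plus unit-variance noise."""
    h = np.zeros(p)
    h[p // 2:] = 1.0
    h /= np.linalg.norm(h)
    z = rng.standard_normal((n, 1))
    return mu + np.sqrt(lam) * z * h + rng.standard_normal((n, p))
```

A typical use would draw two training samples with `make_spiked`, compare `spike_ratio` before and after `remove_spikes`, and call `classify(x0, X1, X2, k=1)` on each new observation. Estimating the eigenvectors from the pooled, class-centered data is a simplification chosen to keep the sketch short.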