ML, Feb 3, 2016

High-Dimensional Regularized Discriminant Analysis

arXiv:1602.01182v2, 8 citations
AI Analysis

This paper addresses the challenge of applying discriminant analysis in high-dimensional, small-sample settings, offering a more computationally efficient and interpretable classifier.

Regularized discriminant analysis (RDA) is impractical and hard to interpret for high-dimensional data. The paper proposes HDRDA, which improves both computational efficiency and interpretability. In the reported experiments, HDRDA often outperforms other classifiers in accuracy on high-dimensional data sets, and its runtime is far lower than that of standard RDA as the number of features grows.

Regularized discriminant analysis (RDA), proposed by Friedman (1989), is a widely popular classifier that lacks interpretability and is impractical for high-dimensional data sets. Here, we present an interpretable and computationally efficient classifier called high-dimensional RDA (HDRDA), designed for the small-sample, high-dimensional setting. For HDRDA, we show that each training observation, regardless of class, contributes to the class covariance matrix, resulting in an interpretable estimator that borrows from the pooled sample covariance matrix. Moreover, we show that HDRDA is equivalent to a classifier in a reduced-feature space with dimension approximately equal to the training sample size. As a result, the matrix operations employed by HDRDA are computationally linear in the number of features, making the classifier well-suited for high-dimensional classification in practice. We demonstrate that HDRDA is often superior to several sparse and regularized classifiers in terms of classification accuracy with three artificial and six real high-dimensional data sets. Also, timing comparisons between our HDRDA implementation in the sparsediscrim R package and the standard RDA formulation in the klaR R package demonstrate that as the number of features increases, the computational runtime of HDRDA is drastically smaller than that of RDA.
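The claim that every training observation contributes to each class covariance estimate can be checked numerically. The sketch below is not the sparsediscrim implementation. It assumes the class estimate is a convex combination `(1 - lam) * Sigma_k + lam * Sigma_pooled` with 1/n denominators, and the function names, the parameter `lam`, and the toy data are all hypothetical. Under those assumptions, the blended matrix equals a weighted sum of outer products over all training rows, with a larger weight for rows in class k and a smaller but nonzero weight for every other row.

```python
import numpy as np


def class_deviations(X, y):
    """Center each row of X by the mean of its own class."""
    D = np.empty_like(X, dtype=float)
    for k in np.unique(y):
        mask = y == k
        D[mask] = X[mask] - X[mask].mean(axis=0)
    return D


def blended_covariance(X, y, k, lam):
    """(1 - lam) * Sigma_k + lam * Sigma_pooled (assumed 1/n denominators)."""
    D = class_deviations(X, y)
    in_k = y == k
    sigma_k = D[in_k].T @ D[in_k] / in_k.sum()
    sigma_pooled = D.T @ D / len(y)
    return (1.0 - lam) * sigma_k + lam * sigma_pooled


def observation_weights(y, k, lam):
    """Weight each training row receives in the class-k estimate."""
    in_k = (y == k).astype(float)
    return lam / len(y) + (1.0 - lam) * in_k / in_k.sum()


def weighted_covariance(X, y, k, lam):
    """The same matrix written as a weighted sum over all training rows."""
    D = class_deviations(X, y)
    w = observation_weights(y, k, lam)
    return (D * w[:, None]).T @ D


# Toy data: 12 observations, 5 features, 3 classes of 4 rows each.
rng = np.random.default_rng(0)
X = rng.normal(size=(12, 5))
y = np.repeat([0, 1, 2], 4)

A = blended_covariance(X, y, k=1, lam=0.3)
B = weighted_covariance(X, y, k=1, lam=0.3)
w = observation_weights(y, k=1, lam=0.3)
same = np.allclose(A, B)
```

With `lam > 0`, every entry of `w` is positive, so rows outside class 1 still shape its covariance estimate; with `lam = 0` only the rows of class 1 contribute.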

