ML LG · Sep 9, 2021

Supervised Linear Dimension-Reduction Methods: Review, Extensions, and Comparisons

Shaojie Xu, Joel Vaughan, Jie Chen, Agus Sudjianto, Vijayan Nair

arXiv:2109.04244v1 · 3.64 citations

Originality: Synthesis-oriented

AI Analysis

This work provides a comparative analysis for researchers and practitioners in data analysis, but it is incremental as it reviews and extends existing methods rather than introducing new ones.

The paper reviews and compares supervised linear dimension-reduction methods, addressing the limitation of unsupervised PCA in predictive performance by incorporating response information, and finds that partial least squares (PLS) and least-squares PCA (LSPCA) consistently outperform other techniques in simulations.

Principal component analysis (PCA) is a well-known linear dimension-reduction method that has been widely used in data analysis and modeling. It is an unsupervised learning technique that identifies a suitable linear subspace for the input variable that contains maximal variation and preserves as much information as possible. PCA has also been used in prediction models where the original, high-dimensional space of predictors is reduced to a smaller, more manageable, set before conducting regression analysis. However, this approach does not incorporate information in the response during the dimension-reduction stage and hence can have poor predictive performance. To address this concern, several supervised linear dimension-reduction techniques have been proposed in the literature. This paper reviews selected techniques, extends some of them, and compares their performance through simulations. Two of these techniques, partial least squares (PLS) and least-squares PCA (LSPCA), consistently outperform the others in this study.
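The contrast the abstract draws between unsupervised PCA and a supervised method can be illustrated with a small sketch. It is not taken from the paper: the synthetic data, the helper name `one_component_r2`, and the choice to keep a single component are all assumptions made for illustration. The supervised direction used here is the first PLS weight vector, which is proportional to X^T y for a centered, univariate response. LSPCA is not implemented.

```python
import numpy as np


def one_component_r2(X, y, w):
    """R^2 from regressing centered y on the single score t = X @ w."""
    t = X @ w
    beta = (t @ y) / (t @ t)          # least-squares slope, no intercept
    resid = y - beta * t
    return 1.0 - (resid @ resid) / (y @ y)


# Hypothetical data: the high-variance predictor is unrelated to y,
# while the low-variance predictor drives the response.
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2)) * np.array([3.0, 1.0])
y = X[:, 1] + 0.1 * rng.normal(size=n)

# Center predictors and response so no intercept is needed.
X = X - X.mean(axis=0)
y = y - y.mean()

# Unsupervised direction: leading principal component (ignores y).
_, _, Vt = np.linalg.svd(X, full_matrices=False)
w_pca = Vt[0]

# Supervised direction: first PLS weight vector, proportional to X^T y.
w_pls = X.T @ y
w_pls = w_pls / np.linalg.norm(w_pls)

r2_pca = one_component_r2(X, y, w_pca)
r2_pls = one_component_r2(X, y, w_pls)
print(f"one-component R^2  PCA: {r2_pca:.3f}  PLS: {r2_pls:.3f}")
```

In this setup the leading principal component follows the high-variance predictor and explains little of the response, whereas the PLS direction leans toward the predictor that is correlated with it. The gap depends on the assumed data-generating process and says nothing about the paper's own simulation designs.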
