Regularized Partial Least Squares with an Application to NMR Spectroscopy
This work provides an incremental improvement for researchers in fields like genomics and chemometrics dealing with high-dimensional data analysis.
The authors address the analysis of high-dimensional data with complex correlation structures by introducing a Regularized Partial Least Squares (PLS) framework. They present it as flexible and fast to compute, and demonstrate its utility in simulations and a case study on NMR spectroscopy data.
High-dimensional data, common in genomics, proteomics, and chemometrics, often contains complicated correlation structures. Recently, partial least squares (PLS) and Sparse PLS methods have gained attention in these areas as dimension reduction techniques in the context of supervised data analysis. We introduce a framework for Regularized PLS by solving a relaxation of the SIMPLS optimization problem with penalties on the PLS loading vectors. Our approach enjoys many advantages, including flexibility, general penalties, easy interpretation of results, and fast computation in high-dimensional settings. We also outline extensions of our methods leading to novel methods for Non-negative PLS and Generalized PLS, an adaptation of PLS for structured data. We demonstrate the utility of our methods through simulations and a case study on proton Nuclear Magnetic Resonance (NMR) spectroscopy data.
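To make the idea of penalizing the loading vectors concrete, the following is a minimal sketch, not the authors' implementation. It assumes an L1 (lasso) penalty and an alternating soft-thresholding update on the cross-product matrix M = X^T Y, which is one common way to obtain a sparse first loading vector. The function names, the penalty weight `lam`, and the stopping rule are all hypothetical choices made for illustration; deflation for further components, the Non-negative PLS variant, and the Generalized PLS variant are omitted.

```python
import numpy as np

def soft_threshold(z, lam):
    """Elementwise soft-thresholding: shrink each entry toward zero by lam."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def sparse_pls_direction(X, Y, lam, n_iter=200, tol=1e-8):
    """First L1-penalized PLS loading vector (illustrative sketch only).

    X: (n, p) column-centered predictors; Y: (n, q) column-centered responses.
    lam: hypothetical name for the L1 penalty weight (assumption, not from the paper).
    Returns (v, u): the X loading vector and the Y-side weight vector.
    """
    M = X.T @ Y                                   # (p, q) cross-product matrix
    # Start u at the leading right singular vector of M.
    u = np.linalg.svd(M, full_matrices=False)[2][0]
    v = np.zeros(M.shape[0])
    for _ in range(n_iter):
        v_new = soft_threshold(M @ u, lam)        # penalized update for v
        norm_v = np.linalg.norm(v_new)
        if norm_v == 0.0:                         # penalty so large that all loadings vanish
            return v_new, u
        v_new /= norm_v
        u = M.T @ v_new                           # unpenalized update for u
        u /= np.linalg.norm(u)
        converged = np.linalg.norm(v_new - v) < tol
        v = v_new
        if converged:
            break
    return v, u
```

With `lam = 0` and a single response column, this sketch reduces to the usual first PLS direction, X^T y scaled to unit length; increasing `lam` sets more entries of `v` exactly to zero, which is what makes the loadings easier to interpret.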