ME, ST, ML · Jun 16, 2021

Pre-processing with Orthogonal Decompositions for High-dimensional Explanatory Variables

arXiv:2106.09071v1
Originality: Incremental advance
AI Analysis

The paper addresses a specific issue in high-dimensional statistics that is relevant to researchers and practitioners, but the advance is incremental because it builds on existing penalized regression methods.

The paper tackles the false inclusion of inactive variables in high-dimensional LASSO regression, a problem caused by correlated explanatory variables. It proposes a pre-processing method called PROD and reports improved performance in simulations and data analysis.

Strong correlations between explanatory variables are problematic for high-dimensional regularized regression methods. Due to the violation of the Irrepresentable Condition, the popular LASSO method may suffer from false inclusions of inactive variables. In this paper, we propose pre-processing with orthogonal decompositions (PROD) for the explanatory variables in high-dimensional regressions. The PROD procedure is constructed based upon a generic orthogonal decomposition of the design matrix. We demonstrate by two concrete cases that the PROD approach can be effectively constructed for improving the performance of high-dimensional penalized regression. Our theoretical analysis reveals their properties and benefits for high-dimensional penalized linear regression with LASSO. Extensive numerical studies with simulations and data analysis show the promising performance of the PROD.
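To make the idea of an orthogonal decomposition of the design matrix concrete, here is a minimal sketch of one plausible instance: splitting X into its projection onto the leading left singular vectors and the orthogonal remainder. The function names (`prod_split`, `max_abs_offdiag_corr`), the choice of an SVD-based projection, the number of components `k`, and the simulated one-factor data are all assumptions made for illustration; the abstract does not specify these details. How the response is adjusted and how the LASSO is then fit are outside the scope of this sketch.

```python
import numpy as np


def prod_split(X, k):
    """Split X into P X and (I - P) X, where P projects onto the span of
    the top-k left singular vectors of X (an illustrative choice of P)."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    Uk = U[:, :k]                      # n x k orthonormal basis
    low_rank = Uk @ (Uk.T @ X)         # P X
    residual = X - low_rank            # (I - P) X
    return low_rank, residual


def max_abs_offdiag_corr(X):
    """Largest absolute pairwise correlation between distinct columns."""
    C = np.corrcoef(X, rowvar=False)
    np.fill_diagonal(C, 0.0)
    return np.abs(C).max()


# Hypothetical data: one shared latent factor makes all columns correlated.
rng = np.random.default_rng(0)
n, p, k = 200, 50, 1
factor = rng.standard_normal((n, k))
loadings = rng.uniform(1.0, 2.0, size=(k, p))
X = factor @ loadings + 0.5 * rng.standard_normal((n, p))
X -= X.mean(axis=0)                    # center the columns

low_rank, residual = prod_split(X, k)

before = max_abs_offdiag_corr(X)
after = max_abs_offdiag_corr(residual)
print(f"max |corr| before: {before:.2f}, after: {after:.2f}")
```

In this simulated setting the two pieces sum back to X and are mutually orthogonal, and the residual columns are much less correlated than the original ones, which is the property a pre-processing step of this kind aims for before a penalized regression is applied.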

