Quantile universal threshold: model selection at the detection edge for high-dimensional linear regression
This work addresses a crucial practical issue for researchers and practitioners who use the lasso and similar methods in high-dimensional data analysis, though the contribution is incremental, since it builds on existing thresholding estimators.
The paper tackles the problem of selecting the threshold parameter λ for sparse linear model estimation in high-dimensional regression, proposing Quantile Universal Thresholding to achieve a high true positive rate and low false discovery rate, with results validated through simulations and real data.
To estimate a sparse linear model from data with Gaussian noise, the consilience of the lasso and compressed sensing literatures is that thresholding estimators such as the lasso and the Dantzig selector can, in some situations, identify part of the significant covariates with high probability asymptotically, and that they are numerically tractable thanks to convexity. Yet the selection of the threshold parameter $λ$ remains crucial in practice. To that end, we propose Quantile Universal Thresholding, a selection of $λ$ at the detection edge. We show with extensive simulations and real data that it achieves an excellent compromise between a high true positive rate and a low false discovery rate, which also leads to good predictive risk.
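The abstract does not spell out how $λ$ is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that the quantile universal threshold is the upper $(1-α)$ quantile of the smallest $λ$ that sets the whole lasso estimate to zero when the response is pure noise. It also assumes a known noise level `sigma`, no intercept, and the lasso objective $\frac{1}{2}\|y - Xβ\|_2^2 + λ\|β\|_1$, for which that smallest $λ$ is $\|X^\top y\|_\infty$. The function names, the Monte Carlo approximation, and the default `alpha` are all illustrative choices.

```python
import numpy as np


def lasso_zero_threshold(X, y):
    """Smallest lambda for which the lasso solution of
    0.5 * ||y - X b||^2 + lambda * ||b||_1 is exactly zero (no intercept)."""
    return np.max(np.abs(X.T @ y))


def qut_lambda(X, sigma=1.0, alpha=0.05, n_draws=2000, seed=0):
    """Monte Carlo estimate of the upper (1 - alpha) quantile of the
    zero-thresholding statistic under the null model y = noise.

    Assumptions (not taken from the abstract): sigma is known and the
    noise is i.i.d. Gaussian with standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Each column is one simulated response under the null (all coefficients zero).
    noise = rng.normal(scale=sigma, size=(n, n_draws))
    # Zero-thresholding statistic for every simulated response at once.
    null_stats = np.max(np.abs(X.T @ noise), axis=0)
    return float(np.quantile(null_stats, 1.0 - alpha))


# Hypothetical design matrix with more covariates than observations.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 200))
lam = qut_lambda(X)
```

Under these assumptions, a lasso fitted with `lam` returns the all-zero estimate with probability of about `1 - alpha` when no covariate is active, which is one way to read "at the detection edge". In practice the noise level would have to be estimated, a step this sketch leaves out.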