A study on tuning parameter selection for the high-dimensional lasso
This work addresses a key challenge in high-dimensional statistics that matters to both researchers and practitioners, though the contribution is incremental in that it builds on the existing variance estimation literature.
The authors tackle the problem of selecting tuning parameters for the lasso in high-dimensional regression, where existing information-criterion approaches perform poorly, and develop new information criteria that perform well across a wide range of simulation conditions.
High-dimensional predictive models, those with more measurements than observations, require regularization to be well defined, perform well empirically, and possess theoretical guarantees. The amount of regularization, often determined by tuning parameters, is integral to achieving good performance. One can choose the tuning parameter in a variety of ways, such as through resampling methods or generalized information criteria. However, the theory supporting many regularized procedures relies on an estimate of the variance parameter, which is difficult to obtain in high dimensions. We develop a suite of information criteria for choosing the tuning parameter in lasso regression by leveraging the literature on high-dimensional variance estimation. We provide intuition showing that existing information-theoretic approaches work poorly in this setting. We compare our risk estimators to existing methods in an extensive simulation study and provide some theoretical justification. We find that our new estimators perform well across a wide range of simulation conditions and evaluation criteria.
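As a rough illustration of how an information criterion can select the lasso tuning parameter, the sketch below scores each value on a grid with a Cp-style risk estimate, RSS/n + 2 * sigma2_hat * df / n, and keeps the minimizer. This is not the authors' suite of criteria. The function names (soft_threshold, lasso_cd, select_lambda_cp), the grid of tuning parameters, and the use of the number of nonzero coefficients as the degrees of freedom are assumptions made for illustration, and sigma2_hat is treated as an input that would have to come from some high-dimensional variance estimator.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, beta0=None, n_iter=200, tol=1e-8):
    """Coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p) if beta0 is None else beta0.copy()
    col_sq = (X ** 2).sum(axis=0) / n      # X_j' X_j / n for each column
    resid = y - X @ beta
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            change = new - beta[j]
            if change != 0.0:
                resid -= X[:, j] * change
                beta[j] = new
                max_change = max(max_change, abs(change))
        if max_change < tol:               # stop once a full sweep barely moves
            break
    return beta


def select_lambda_cp(X, y, sigma2_hat, n_lambda=30):
    """Pick lambda minimizing RSS/n + 2 * sigma2_hat * df / n on a grid.

    sigma2_hat is an assumed plug-in variance estimate supplied by the caller.
    df is taken to be the number of nonzero coefficients (an assumption here).
    """
    n, p = X.shape
    lam_max = np.max(np.abs(X.T @ y)) / n  # smallest lambda giving the zero fit
    lambdas = lam_max * np.logspace(0, -2, n_lambda)
    beta = np.zeros(p)
    scores, betas = [], []
    for lam in lambdas:
        beta = lasso_cd(X, y, lam, beta0=beta)   # warm start along the path
        rss = np.sum((y - X @ beta) ** 2)
        df = np.count_nonzero(beta)
        scores.append(rss / n + 2.0 * sigma2_hat * df / n)
        betas.append(beta.copy())
    best = int(np.argmin(scores))
    return lambdas[best], betas[best], np.array(scores)
```

With a design matrix X, a response y, and a variance estimate in hand, select_lambda_cp(X, y, sigma2_hat) returns the chosen tuning parameter, the corresponding coefficient vector, and the score at every grid point. The quality of the selection depends heavily on sigma2_hat, which is the part the abstract identifies as difficult when there are more measurements than observations.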