Prediction Errors for Penalized Regressions based on Generalized Approximate Message Passing
This work offers statisticians and machine learning practitioners theoretical insight into prediction error estimation; the contribution is incremental, since it builds on the existing GAMP and replica frameworks.
The paper analyzes prediction errors for penalized regression in generalized linear models. It derives estimators such as the $C_p$ criterion, information criteria, and the LOOCV error using GAMP and the replica method, and shows that these estimators diverge when the number of model parameters exceeds the data dimension.
We discuss the prediction accuracy of assumed statistical models in terms of prediction errors for the generalized linear model and penalized maximum likelihood methods. We derive the forms of estimators for the prediction errors, such as the $C_p$ criterion, information criteria, and the leave-one-out cross-validation (LOOCV) error, using the generalized approximate message passing (GAMP) algorithm and the replica method. These estimators coincide when the number of model parameters is sufficiently small; however, they differ, particularly in the parameter region where the number of model parameters is larger than the data dimension. In this paper, we review the prediction errors and the corresponding estimators, and discuss their differences. In the framework of GAMP, we show that the information criteria can be expressed in terms of the variance of the estimates. Further, we demonstrate how to approach the LOOCV error from the information criteria by using the expression provided by GAMP.
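To make the comparison of estimators concrete, the following is a minimal sketch, not the paper's GAMP or replica derivation. It assumes ridge regression with squared loss, a fixed penalty, no intercept, and a known noise variance, so that both a $C_p$-type estimate and the exact LOOCV error follow from the hat matrix in closed form. The function name `ridge_prediction_error_estimates`, the simulated data, and the convention $C_p = \text{training error} + 2\sigma^2\,\mathrm{df}/n$ are illustrative assumptions rather than notation taken from the paper.

```python
import numpy as np


def ridge_prediction_error_estimates(X, y, lam, sigma2):
    """Return (training error, C_p-type estimate, LOOCV error) for ridge regression.

    Hypothetical helper for illustration: assumes squared loss, a fixed
    penalty lam > 0, no intercept, and a known noise variance sigma2.
    """
    n, p = X.shape
    # Hat matrix H = X (X^T X + lam I)^{-1} X^T maps y to the fitted values.
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)
    resid = y - H @ y
    train_err = np.mean(resid ** 2)
    # Effective degrees of freedom of the linear smoother.
    df = np.trace(H)
    # C_p-type estimate: training error plus a covariance penalty.
    cp = train_err + 2.0 * sigma2 * df / n
    # For ridge with fixed lam, the leave-one-out residual equals
    # resid_i / (1 - H_ii), so no refitting is needed.
    loocv = np.mean((resid / (1.0 - np.diag(H))) ** 2)
    return train_err, cp, loocv


def demo(n=200, sigma2=1.0, lam=1.0, seed=0):
    """Print the three quantities for p < n and for p > n (simulated data)."""
    rng = np.random.default_rng(seed)
    for p in (10, 400):
        X = rng.standard_normal((n, p)) / np.sqrt(n)
        beta = rng.standard_normal(p)
        y = X @ beta + np.sqrt(sigma2) * rng.standard_normal(n)
        train_err, cp, loocv = ridge_prediction_error_estimates(X, y, lam, sigma2)
        print(f"p={p:4d}  train={train_err:.3f}  Cp={cp:.3f}  LOOCV={loocv:.3f}")


if __name__ == "__main__":
    demo()
```

Running `demo()` prints the training error, the $C_p$-type estimate, and the LOOCV error for one setting with fewer parameters than observations and one with more, which is the regime where the abstract reports that the estimators separate. The closed-form leave-one-out shortcut used here is specific to linear smoothers such as ridge regression; it does not extend directly to general penalties or likelihoods.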