LG ML | Jan 28, 2024

Prevalidated ridge regression is a highly-efficient drop-in replacement for logistic regression for high-dimensional data

Angus Dempster, Geoffrey I. Webb, Daniel F. Schmidt

arXiv:2401.15610v2 | 2.6 | h-index: 7 | Has Code

Originality: Incremental advance

AI Analysis

The paper gives practitioners working on high-dimensional classification tasks a more efficient alternative to logistic regression, though the advance is incremental because it builds on existing ridge regression methods.

The authors address the computational cost and hyperparameter tuning burden of logistic regression on high-dimensional data by introducing a prevalidated ridge regression model. The model achieves classification error and log-loss comparable to logistic regression at significantly lower computational cost and with minimal hyperparameters.

Logistic regression is a ubiquitous method for probabilistic classification. However, the effectiveness of logistic regression depends upon careful and relatively computationally expensive tuning, especially for the regularisation hyperparameter, and especially in the context of high-dimensional data. We present a prevalidated ridge regression model that closely matches logistic regression in terms of classification error and log-loss, particularly for high-dimensional data, while being significantly more computationally efficient and having effectively no hyperparameters beyond regularisation. We scale the coefficients of the model so as to minimise log-loss for a set of prevalidated predictions derived from the estimated leave-one-out cross-validation error. This exploits quantities already computed in the course of fitting the ridge regression model in order to find the scaling parameter with nominal additional computational expense.
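
The procedure described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. It assumes a binary problem with labels coded as -1/+1, no intercept, a fixed grid of candidate regularisation values, and a simple grid search for the scaling parameter; the function names `fit_prevalidated_ridge` and `predict_proba` and both grids are hypothetical choices made for illustration. The sketch reuses one SVD of the feature matrix to obtain the fitted values, the hat-matrix diagonal, and hence the exact leave-one-out predictions for every candidate regularisation value, then rescales the coefficients to minimise log-loss on those prevalidated predictions.

```python
import numpy as np

def fit_prevalidated_ridge(X, y,
                           lambdas=np.logspace(-3, 3, 13),
                           scales=np.logspace(-2, 2, 201)):
    """Illustrative sketch. X: (n, p) features; y: (n,) labels in {-1, +1}.

    Assumptions (not from the paper): no intercept, grid search over both
    the ridge penalty and the scaling parameter.
    """
    # One SVD serves every candidate penalty.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y

    best = None
    for lam in lambdas:
        shrink = s**2 / (s**2 + lam)         # per-component shrinkage
        y_hat = U @ (shrink * Uty)           # in-sample fitted values
        h = (U**2) @ shrink                  # diagonal of the hat matrix
        loo_resid = (y - y_hat) / (1.0 - h)  # exact leave-one-out residuals
        loo_mse = np.mean(loo_resid**2)
        if best is None or loo_mse < best[0]:
            # y - loo_resid is the leave-one-out (prevalidated) prediction.
            best = (loo_mse, lam, y - loo_resid)

    _, lam, prevalidated = best
    beta = Vt.T @ ((s / (s**2 + lam)) * Uty)  # ridge coefficients at chosen penalty

    # Choose the scale c that minimises the logistic log-loss of the
    # prevalidated predictions: mean(log(1 + exp(-c * y_i * f_i))).
    margins = y * prevalidated
    losses = [np.mean(np.logaddexp(0.0, -c * margins)) for c in scales]
    c = scales[int(np.argmin(losses))]
    return c * beta, lam, c

def predict_proba(X, coef):
    """Probability of the +1 class under the scaled linear model."""
    return 1.0 / (1.0 + np.exp(-(X @ coef)))
```

Because the scale is fitted on leave-one-out predictions rather than in-sample fitted values, it is not biased towards overconfident probabilities. The only extra work beyond the ridge fit is a one-dimensional search over the scale.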
