ST · ML · Oct 6, 2019

Ridge Regression: Structure, Cross-Validation, and Sketching

arXiv:1910.02373v3 · 52 citations
Originality: Incremental advance
AI Analysis

This work refines the methodology of ridge regression, a regularization method widely used in statistics and machine learning. The advance is incremental: it builds on existing theory and contributes specific refinements.

The paper tackles three fundamental problems in ridge regression: understanding the estimator's structure, correcting bias in cross-validation for regularization parameter selection, and analyzing the accuracy of sketching methods for computational acceleration. It provides precise theoretical results showing that sketching methods are surprisingly accurate and proposes a simple bias-correction for cross-validation.

We study the following three fundamental problems about ridge regression: (1) what is the structure of the estimator? (2) how to correctly use cross-validation to choose the regularization parameter? and (3) how to accelerate computation without losing too much accuracy? We consider the three problems in a unified large-data linear model. We give a precise representation of ridge regression as a covariance matrix-dependent linear combination of the true parameter and the noise. We study the bias of $K$-fold cross-validation for choosing the regularization parameter, and propose a simple bias-correction. We analyze the accuracy of primal and dual sketching for ridge regression, showing they are surprisingly accurate. Our results are illustrated by simulations and by analyzing empirical data.
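As a rough illustration of problem (3), the sketch below compares a full ridge fit with a sketched one on simulated data. It rests on several assumptions that do not come from the abstract: the ridge estimator is normalized as $(X^\top X/n + \lambda I)^{-1} X^\top y/n$, "primal sketching" is taken to mean replacing the Gram matrix $X^\top X$ with $X^\top S^\top S X$ for a Gaussian sketching matrix $S$, and the function names, dimensions, and value of $\lambda$ are purely illustrative. It is not the authors' implementation.

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge estimate (X^T X / n + lam * I)^{-1} X^T y / n (assumed normalization)."""
    n, p = X.shape
    return np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)

def primal_sketch_ridge(X, y, lam, m, rng):
    """Ridge with the Gram matrix replaced by a sketched Gram matrix.

    Assumption: S is an m x n Gaussian matrix scaled so that E[S^T S] = I_n.
    Only the Gram matrix is sketched; X^T y is computed exactly.
    """
    n, p = X.shape
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    SX = S @ X  # m x p sketched design matrix
    return np.linalg.solve(SX.T @ SX / n + lam * np.eye(p), X.T @ y / n)

# Simulated linear model y = X beta + noise (illustrative sizes only).
rng = np.random.default_rng(0)
n, p, lam = 2000, 50, 0.5
beta = rng.standard_normal(p) / np.sqrt(p)
X = rng.standard_normal((n, p))
y = X @ beta + rng.standard_normal(n)

beta_full = ridge(X, y, lam)
beta_sketch = primal_sketch_ridge(X, y, lam, m=1000, rng=rng)

# Relative distance between the sketched and the full ridge estimates.
rel_err = np.linalg.norm(beta_sketch - beta_full) / np.linalg.norm(beta_full)
print(f"relative error of sketched estimate: {rel_err:.3f}")
```

Varying the hypothetical sketch size `m` shows the trade-off between computation and accuracy: as `m` grows, the sketched Gram matrix approaches the exact one and the two estimates converge.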

Code Implementations: 2 repos