LG · ML · May 1, 2020

Generalization Error of Generalized Linear Models in High Dimensions

arXiv:2005.00180v1 · 40 citations
Originality: Incremental advance
AI Analysis

This paper addresses a foundational problem in machine learning, the generalization of learned models to unseen data, by offering a theoretical framework. The advance is incremental because it builds on existing work on generalization.

The paper addresses generalization in over-parameterized models. It provides a framework that characterizes the asymptotic generalization error of single-layer neural networks with arbitrary non-linearities. The framework supports analysis of factors such as over-parameterization and the choice of loss function, and it gives a rigorous explanation of the double descent phenomenon.

At the heart of machine learning lies the question of generalizability of learned rules over previously unseen data. While over-parameterized models based on neural networks are now ubiquitous in machine learning applications, our understanding of their generalization capabilities is incomplete. This task is made harder by the non-convexity of the underlying learning problems. We provide a general framework to characterize the asymptotic generalization error for single-layer neural networks (i.e., generalized linear models) with arbitrary non-linearities, making it applicable to regression as well as classification problems. This framework enables analyzing the effect of (i) over-parameterization and non-linearity during modeling; and (ii) choices of loss function, initialization, and regularizer during learning. Our model also captures mismatch between training and test distributions. As examples, we analyze a few special cases, namely linear regression and logistic regression. We are also able to rigorously and analytically explain the double descent phenomenon in generalized linear models.
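The double descent behaviour mentioned in the abstract can be observed empirically in the simplest special case it names, linear regression. The sketch below is not the paper's asymptotic analysis. It is a small finite-size simulation under assumptions chosen here for illustration: isotropic Gaussian features, a unit-norm ground-truth weight vector, Gaussian label noise, and a minimum-norm least-squares fit with no regularizer. The function names `min_norm_test_error` and `sweep`, and all default parameter values, are hypothetical.

```python
import numpy as np


def min_norm_test_error(n_train, p, noise_std, rng):
    """Population test MSE of the minimum-norm least-squares fit.

    Assumes x ~ N(0, I_p) and y = x . w_true + noise, so the expected
    squared error on a fresh sample is ||w_hat - w_true||^2 + noise_std^2.
    """
    w_true = rng.standard_normal(p)
    w_true /= np.linalg.norm(w_true)  # unit-norm ground truth (assumption)

    X = rng.standard_normal((n_train, p))
    y = X @ w_true + noise_std * rng.standard_normal(n_train)

    # The pseudoinverse gives ordinary least squares when n_train > p and
    # the minimum-norm interpolating solution when n_train < p.
    w_hat = np.linalg.pinv(X) @ y

    return float(np.sum((w_hat - w_true) ** 2) + noise_std ** 2)


def sweep(p=50, sample_sizes=(25, 50, 100), noise_std=0.5, trials=20, seed=0):
    """Average the test error over several random draws per sample size."""
    rng = np.random.default_rng(seed)
    return {
        n: float(np.mean([min_norm_test_error(n, p, noise_std, rng)
                          for _ in range(trials)]))
        for n in sample_sizes
    }


if __name__ == "__main__":
    for n, err in sweep().items():
        print(f"n_train={n:4d}  p/n={50 / n:.2f}  test MSE={err:.3f}")
```

Under these assumptions the averaged test error should be largest near the interpolation threshold (number of training samples equal to the number of features) and smaller on both sides of it, which is the peak that gives double descent its name. The sweep here varies the sample size at a fixed feature count; the sizes and noise level are arbitrary and can be changed to trace the curve more finely.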

Code Implementations: 3 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
