LG ML · Jun 23, 2023

Precise Asymptotic Generalization for Multiclass Classification with Overparameterized Linear Models

Berkeley

arXiv:2306.13255v37.74 citations · h-index: 40

Originality: Incremental advance

AI Analysis

This paper gives researchers in machine learning theory theoretical insight into generalization in overparameterized settings, though it is incremental on prior work.

The paper studies the asymptotic generalization of overparameterized linear models for multiclass classification under a Gaussian bi-level model. It fully resolves a prior conjecture and shows that the misclassification rate asymptotically goes to 0 or 1, and that the min-norm interpolating classifier can be suboptimal in certain regimes.

We study the asymptotic generalization of an overparameterized linear model for multiclass classification under the Gaussian covariates bi-level model introduced in Subramanian et al. '22, where the number of data points, features, and classes all grow together. We fully resolve the conjecture posed in Subramanian et al. '22, matching the predicted regimes for generalization. Furthermore, our new lower bounds are akin to an information-theoretic strong converse: they establish that the misclassification rate goes to 0 or 1 asymptotically. One surprising consequence of our tight results is that the min-norm interpolating classifier can be asymptotically suboptimal relative to noninterpolating classifiers in the regime where the min-norm interpolating regressor is known to be optimal. The key to our tight analysis is a new variant of the Hanson-Wright inequality which is broadly useful for multiclass problems with sparse labels. As an application, we show that the same type of analysis can be used to analyze the related multilabel classification problem under the same bi-level ensemble.
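
To make the term "min-norm interpolating classifier" concrete, here is a minimal sketch of one common construction: fit the minimum-norm weights that exactly reproduce one-hot training labels, then classify by the largest score. The sizes, the isotropic Gaussian features, the argmax labeling rule, and the function names are all illustrative assumptions, not taken from the paper. In particular, this does not implement the bi-level covariance model.

```python
import numpy as np

def min_norm_interpolator(X, Y):
    """Minimum-Frobenius-norm W satisfying X @ W = Y (requires rank(X) = n)."""
    # Closed form W = X^T (X X^T)^{-1} Y, computed by solving the n x n Gram system.
    gram = X @ X.T
    return X.T @ np.linalg.solve(gram, Y)

def predict(W, X):
    # Assign each point to the class with the largest linear score.
    return np.argmax(X @ W, axis=1)

rng = np.random.default_rng(0)
n, d, k = 50, 2000, 5                  # illustrative sizes with d >> n (assumption)
X = rng.standard_normal((n, d))        # isotropic Gaussian features (assumption)
labels = np.argmax(X[:, :k], axis=1)   # hypothetical labeling rule: argmax of first k features
Y = np.eye(k)[labels]                  # plain one-hot label matrix (assumption)

W = min_norm_interpolator(X, Y)
train_acc = np.mean(predict(W, X) == labels)
```

Because the number of features exceeds the number of samples, the linear system is underdetermined and the fitted scores match the training labels exactly. Whether such a classifier generalizes is the question the abstract addresses.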
