ML · LG · Jul 15, 2023

The Interpolating Information Criterion for Overparameterized Models

arXiv:2307.07785v2 · 12 citations · h-index: 67
Originality: Incremental advance
AI Analysis

This paper addresses model selection for researchers who use overparameterized models in machine learning. It offers a novel criterion, but the advance is incremental because it builds on classical information theory.

The paper tackles model selection for overparameterized models, where the number of parameters exceeds the size of the dataset. It introduces the Interpolating Information Criterion, which incorporates the choice of prior and accounts for prior misspecification and for geometric and spectral properties of the model. The criterion is shown to be numerically consistent with known empirical and theoretical behavior.

The problem of model selection is considered for the setting of interpolating estimators, where the number of model parameters exceeds the size of the dataset. Classical information criteria typically consider the large-data limit, penalizing model size. However, these criteria are not appropriate in modern settings where overparameterized models tend to perform well. For any overparameterized model, we show that there exists a dual underparameterized model that possesses the same marginal likelihood, thus establishing a form of Bayesian duality. This enables more classical methods to be used in the overparameterized setting, revealing the Interpolating Information Criterion, a measure of model quality that naturally incorporates the choice of prior into the model selection. Our new information criterion accounts for prior misspecification, geometric and spectral properties of the model, and is numerically consistent with known empirical and theoretical behavior in this regime.
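To make the phrase "penalizing model size" concrete, the following is a minimal sketch of one classical criterion, the Bayesian Information Criterion (BIC), applied to polynomial regression. It is not the Interpolating Information Criterion and is not taken from the paper. The Gaussian noise model, the synthetic quadratic data, and the helper names `gaussian_bic` and `fit_polynomial` are all assumptions made for illustration.

```python
import numpy as np


def gaussian_bic(y, y_hat, k):
    """BIC = k*ln(n) - 2*ln(L_hat), assuming i.i.d. Gaussian noise.

    k counts only the regression coefficients here; counting the noise
    variance as well would shift every score by the same constant ln(n).
    """
    n = y.shape[0]
    rss = float(np.sum((y - y_hat) ** 2))
    sigma2 = rss / n  # maximum-likelihood estimate of the noise variance
    log_lik = -0.5 * n * (np.log(2.0 * np.pi * sigma2) + 1.0)
    return k * np.log(n) - 2.0 * log_lik


def fit_polynomial(x, y, degree):
    """Least-squares polynomial fit; returns fitted values and coefficient count."""
    X = np.vander(x, degree + 1, increasing=True)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return X @ coef, degree + 1


# Hypothetical synthetic data: a quadratic signal plus small Gaussian noise.
rng = np.random.default_rng(0)
n = 50
x = np.linspace(-1.0, 1.0, n)
y = 1.0 + 2.0 * x - 3.0 * x**2 + 0.1 * rng.standard_normal(n)

# Score underparameterized candidates only (k < n); lower BIC is better.
scores = {}
for degree in range(1, 9):
    y_hat, k = fit_polynomial(x, y, degree)
    scores[degree] = gaussian_bic(y, y_hat, k)

best_degree = min(scores, key=scores.get)
```

Each extra coefficient adds ln(n) to the score regardless of how the model uses it. Once the coefficient count reaches the dataset size, a least-squares fit can interpolate the data, the residual sum of squares approaches zero, and the log-likelihood term diverges. This is one way to see why a criterion of this form is not suited to the interpolating regime the abstract describes.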

