ML LG · Jul 15, 2023

The Interpolating Information Criterion for Overparameterized Models

Liam Hodgkinson, Chris van der Heide, Robert Salomone, Fred Roosta, Michael W. Mahoney

arXiv:2307.07785v2 · 16.812 citations · h-index: 67

Originality: Incremental advance

AI Analysis

This paper addresses model selection for researchers who use overparameterized models in machine learning. It offers a novel criterion, but the advance is incremental because it builds on classical information criteria.

The paper tackles model selection for overparameterized models, where the number of parameters exceeds the size of the dataset. It introduces the Interpolating Information Criterion, which incorporates the choice of prior and accounts for prior misspecification and for geometric and spectral properties of the model. The authors report that the criterion is numerically consistent with known empirical and theoretical behavior in this regime.

The problem of model selection is considered for the setting of interpolating estimators, where the number of model parameters exceeds the size of the dataset. Classical information criteria typically consider the large-data limit, penalizing model size. However, these criteria are not appropriate in modern settings where overparameterized models tend to perform well. For any overparameterized model, we show that there exists a dual underparameterized model that possesses the same marginal likelihood, thus establishing a form of Bayesian duality. This enables more classical methods to be used in the overparameterized setting, revealing the Interpolating Information Criterion, a measure of model quality that naturally incorporates the choice of prior into the model selection. Our new information criterion accounts for prior misspecification, geometric and spectral properties of the model, and is numerically consistent with known empirical and theoretical behavior in this regime.
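To give a feel for what it means for an overparameterized model and a lower-dimensional model to share the same marginal likelihood, here is a minimal sketch using a standard textbook identity. It is not the construction from the paper. It assumes Bayesian linear regression with an isotropic Gaussian prior (precision `alpha`) and Gaussian noise (variance `sigma2`); the function names and all numerical settings are illustrative choices. The log marginal likelihood is computed once in the d-dimensional parameter space and once in the n-dimensional data space, with d much larger than n.

```python
import numpy as np

def log_evidence_primal(X, y, alpha, sigma2):
    """Log marginal likelihood via the d x d parameter-space matrix."""
    n, d = X.shape
    A = alpha * np.eye(d) + X.T @ X / sigma2          # posterior precision of the weights
    _, logdet_A = np.linalg.slogdet(A)
    Xty = X.T @ y
    # y^T K^{-1} y rewritten with the Woodbury identity
    quad = y @ y / sigma2 - Xty @ np.linalg.solve(A, Xty) / sigma2**2
    return 0.5 * (d * np.log(alpha) - n * np.log(2 * np.pi * sigma2) - logdet_A - quad)

def log_evidence_dual(X, y, alpha, sigma2):
    """Log marginal likelihood via the n x n data-space covariance."""
    n = X.shape[0]
    K = sigma2 * np.eye(n) + X @ X.T / alpha          # covariance of y under the prior
    _, logdet_K = np.linalg.slogdet(K)
    quad = y @ np.linalg.solve(K, y)
    return -0.5 * (n * np.log(2 * np.pi) + logdet_K + quad)

# Hypothetical overparameterized setting: 20 data points, 200 parameters.
rng = np.random.default_rng(0)
n, d = 20, 200
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

primal = log_evidence_primal(X, y, alpha=1.0, sigma2=0.1)
dual = log_evidence_dual(X, y, alpha=1.0, sigma2=0.1)
print(np.isclose(primal, dual))
```

In this Gaussian special case the two forms agree because of the Woodbury identity and Sylvester's determinant theorem, so the evidence can be evaluated with an n x n matrix even when d exceeds n. The duality described in the abstract is stated for general interpolating estimators and may differ substantially from this linear example.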
