ML · LG · Jun 28, 2022

Studying Generalization Through Data Averaging

arXiv:2206.13669v1
h-index: 8
Originality: Incremental advance
AI Analysis

This work provides theoretical insights into generalization for researchers in machine learning, but it is incremental as it builds on existing analysis of data averaging and SGD effects.

The authors study generalization by analyzing the average behavior of train and test performance across data set samples. They derive expressions for the generalization gap and for test performance in terms of the data-averaged parameter distribution and the data-averaged loss. They show that a modified generalization gap is non-negative for a large class of parameter distributions and make predictions about how SGD noise affects generalization, which they test empirically on CIFAR-10 with a ResNet.

The generalization of machine learning models has a complex dependence on the data, model and learning algorithm. We study train and test performance, as well as the generalization gap given by the mean of their difference over different data set samples, to understand their "typical" behavior. We derive an expression for the gap as a function of the covariance between the model parameter distribution and the train loss, and another expression for the average test performance, showing that test generalization depends only on the data-averaged parameter distribution and the data-averaged loss. We show that for a large class of model parameter distributions a modified generalization gap is always non-negative. By specializing further to parameter distributions produced by stochastic gradient descent (SGD), along with a few approximations and modeling considerations, we are able to predict some aspects of how the generalization gap and model train and test performance vary as a function of SGD noise. We evaluate these predictions empirically on the CIFAR-10 classification task based on a ResNet architecture.
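
The covariance statement in the abstract can be illustrated with a small numerical sketch. Everything in it is a toy assumption rather than the paper's setup: a one-dimensional parameter on a discrete grid, a squared-error loss, a Gibbs-like parameter distribution p(theta | D) proportional to exp(-L_D(theta) / T), and the hypothetical function name `data_averaged_gap`. The data-averaged loss is approximated by the mean train loss over the sampled data sets, so within this ensemble the averaged gap equals minus the covariance between p(theta | D) and L_D(theta), summed over the grid.

```python
import numpy as np

def data_averaged_gap(n_datasets=200, n_points=20, n_theta=50,
                      temperature=0.5, seed=0):
    """Toy check: averaged gap = -sum_theta Cov_D(p(theta|D), L_D(theta))."""
    rng = np.random.default_rng(seed)
    # Discrete parameter grid (toy assumption, not from the paper).
    thetas = np.linspace(-2.0, 2.0, n_theta)
    # Each row is one sampled data set of scalar observations.
    x = rng.normal(1.0, 1.0, size=(n_datasets, n_points))
    # train_loss[d, k] = mean squared error of theta_k on data set d.
    train_loss = ((thetas[None, None, :] - x[:, :, None]) ** 2).mean(axis=1)
    # Gibbs-like p(theta | D), normalized over the grid for each data set.
    logits = -train_loss / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    p /= p.sum(axis=1, keepdims=True)
    # Data-averaged quantities (ensemble means stand in for expectations).
    mean_p = p.mean(axis=0)
    mean_loss = train_loss.mean(axis=0)
    # Average train loss: each distribution evaluated on its own data set.
    avg_train = (p * train_loss).sum(axis=1).mean()
    # Average test loss: depends only on mean_p and mean_loss.
    avg_test = (mean_p * mean_loss).sum()
    # Covariance over data sets, computed separately at every grid point.
    cov = (p * train_loss).mean(axis=0) - mean_p * mean_loss
    return avg_train, avg_test, cov.sum()

avg_train, avg_test, cov_sum = data_averaged_gap()
print("gap:", avg_test - avg_train, "minus covariance:", -cov_sum)
```

In this toy the parameter distribution puts more mass where the train loss on its own data set is low, so the covariance is negative and the averaged gap is positive. The paper's modified gap and its SGD-specific predictions are not reproduced here.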

