LG | DIS-NN | AI | Mar 24, 2023

Online Learning for the Random Feature Model in the Student-Teacher Framework

arXiv:2303.14083v2 | h-index: 28
Originality: Synthesis-oriented
AI Analysis

This work contributes to the theoretical understanding of over-parametrization and is aimed at machine learning researchers. It is incremental: it builds on existing frameworks and introduces no new methods.

The paper studies over-parametrization in neural networks by analyzing a random feature model in a student-teacher framework. It finds that perfect generalization is impossible unless the student's hidden layer size is exponentially larger than the input dimension, and it computes the non-zero asymptotic generalization error for any finite ratio of hidden layer size to input dimension.

Deep neural networks are widely used prediction algorithms whose performance often improves as the number of weights increases, leading to over-parametrization. We consider a two-layered neural network whose first layer is frozen while the last layer is trainable, known as the random feature model. We study over-parametrization in the context of a student-teacher framework by deriving a set of differential equations for the learning dynamics. For any finite ratio of hidden layer size to input dimension, the student cannot generalize perfectly, and we compute the non-zero asymptotic generalization error. An approach to perfect generalization is possible only when the student's hidden layer size is exponentially larger than the input dimension.
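
The setup described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's method or its differential equations. It is a minimal online-learning loop under assumptions that this listing does not specify: a single-unit tanh teacher, tanh random features, Gaussian inputs, a squared loss, and a learning rate scaled by 1/K. All function and parameter names (`train_online`, `gen_error`, `eta`, and so on) are hypothetical.

```python
import math
import random


def features(F, x):
    """Frozen first layer: tanh of random projections (activation is an assumption)."""
    scale = math.sqrt(len(x))
    return [math.tanh(sum(f * xi for f, xi in zip(row, x)) / scale) for row in F]


def teacher(w_star, x):
    """Hypothetical teacher: a single tanh unit acting on the raw input."""
    scale = math.sqrt(len(x))
    return math.tanh(sum(w * xi for w, xi in zip(w_star, x)) / scale)


def gen_error(a, F, w_star, rng, n_test=500):
    """Monte Carlo estimate of the generalization error 0.5 * E[(teacher - student)^2]."""
    n = len(w_star)
    total = 0.0
    for _ in range(n_test):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        phi = features(F, x)
        student = sum(ak * pk for ak, pk in zip(a, phi))
        total += 0.5 * (teacher(w_star, x) - student) ** 2
    return total / n_test


def train_online(n=10, k=20, steps=3000, eta=0.1, seed=0):
    """Online SGD on the last layer only; each example is drawn fresh and used once."""
    rng = random.Random(seed)
    # Random first-layer weights, drawn once and never updated.
    F = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
    w_star = [rng.gauss(0.0, 1.0) for _ in range(n)]
    a = [0.0] * k  # trainable last-layer weights

    # Use the same test seed before and after so the two estimates are comparable.
    err_before = gen_error(a, F, w_star, random.Random(seed + 1))
    for _ in range(steps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        phi = features(F, x)
        delta = teacher(w_star, x) - sum(ak * pk for ak, pk in zip(a, phi))
        # Gradient step on 0.5 * delta^2 with respect to the last layer only.
        a = [ak + (eta / k) * delta * pk for ak, pk in zip(a, phi)]
    err_after = gen_error(a, F, w_star, random.Random(seed + 1))
    return err_before, err_after


if __name__ == "__main__":
    before, after = train_online()
    print(f"generalization error before: {before:.4f}, after: {after:.4f}")
```

Varying `k` at fixed `n` in this toy version gives a rough feel for how the residual error depends on the ratio of hidden layer size to input dimension, though it makes no claim to reproduce the paper's asymptotic results.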

