LG · Dec 11, 2022

Generalization Through the Lens of Learning Dynamics

arXiv:2212.05377v1 · 1.8 · h-index: 17

Originality: Synthesis-oriented

AI Analysis

The thesis addresses the problem of understanding and improving generalization, which matters to developers of reliable machine learning systems, though it appears incremental because it builds on existing theoretical frameworks.

The thesis investigates generalization in deep neural networks by analyzing learning dynamics in supervised and reinforcement learning tasks, aiming to explain their strong performance despite theoretical challenges.

A machine learning (ML) system must learn not only to match the output of a target function on a training set, but also to generalize to novel situations in order to yield accurate predictions at deployment. In most practical applications, the user cannot exhaustively enumerate every possible input to the model; strong generalization performance is therefore crucial to the development of ML systems that are performant and reliable enough to be deployed in the real world. While generalization is well understood theoretically in a number of hypothesis classes, the impressive generalization performance of deep neural networks has stymied theoreticians. In deep reinforcement learning (RL), our understanding of generalization is further complicated by the conflict between generalization and stability in widely used RL algorithms. This thesis will provide insight into generalization by studying the learning dynamics of deep neural networks in both supervised and reinforcement learning tasks.
