LG, AI, ML | Aug 3, 2018

Generalization Error in Deep Learning

arXiv:1808.01174v3 | 131 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the fundamental question of why deep learning models generalize well, which matters to both researchers and practitioners in machine learning. Its contribution is incremental: it synthesizes existing work rather than introducing new findings.

The paper tackles the problem of understanding the generalization ability of deep neural networks by providing an overview of existing theories and bounds for generalization error, combining classical and recent theoretical and empirical results.

Deep learning models have lately shown great performance in various fields such as computer vision, speech recognition, speech translation, and natural language processing. However, alongside their state-of-the-art performance, it is still generally unclear what is the source of their generalization ability. Thus, an important question is what makes deep neural networks able to generalize well from the training set to new data. In this article, we provide an overview of the existing theory and bounds for the characterization of the generalization error of deep neural networks, combining both classical and more recent theoretical and empirical results.

