LG | Nov 18, 2022

Understanding the double descent curve in Machine Learning

arXiv:2211.10322v1 | 3 citations | h-index: 17
Originality: Incremental advance
AI Analysis

This paper addresses a foundational problem in machine learning theory, offering researchers and practitioners insight into model selection beyond the traditional bias-variance trade-off.

The paper tackles the lack of a fundamental theoretical understanding of the double descent curve in machine learning, in which over-parameterized models perform well despite being expected to overfit. It develops a principled explanation and reports experimental results that align with its hypothesis.

Bias-variance theory used to serve as a guide for model selection when applying machine learning algorithms. However, modern practice has shown success with over-parameterized models that were expected to overfit but did not. This led Belkin et al. to propose the double descent curve of performance. Although it seems to describe a real, representative phenomenon, the field lacks a fundamental theoretical understanding of what is happening, what the consequences are for model selection, and when double descent is expected to occur. In this paper we develop a principled understanding of the phenomenon and sketch answers to these important questions. Furthermore, we report real experimental results that are correctly predicted by our proposed hypothesis.
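
For readers who have not seen the curve, the sketch below shows one common way to probe for double descent numerically. It is not the paper's experiment or its hypothesis. It assumes a synthetic setup chosen purely for illustration: a noisy linear teacher, random ReLU features, and minimum-norm least squares, with the number of features swept across the interpolation threshold (features equal to training points). The function names, sizes, and noise level are all hypothetical.

```python
import numpy as np


def random_relu_features(X, W):
    """Map inputs to random ReLU features max(0, X W)."""
    return np.maximum(X @ W, 0.0)


def double_descent_curve(widths, n_train=40, n_test=500, dim=10,
                         noise_std=0.5, n_trials=20, seed=0):
    """Return {width: median test MSE} for least squares on random features.

    All sizes and the linear teacher are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    errors = {p: [] for p in widths}
    for _ in range(n_trials):
        # Hypothetical teacher: a unit-norm linear function plus label noise.
        w_true = rng.normal(size=dim)
        w_true /= np.linalg.norm(w_true)
        X_train = rng.normal(size=(n_train, dim))
        X_test = rng.normal(size=(n_test, dim))
        y_train = X_train @ w_true + noise_std * rng.normal(size=n_train)
        y_test = X_test @ w_true  # noise-free targets for evaluation
        for p in widths:
            W = rng.normal(size=(dim, p))  # fresh random projection per width
            Phi_train = random_relu_features(X_train, W)
            Phi_test = random_relu_features(X_test, W)
            # lstsq gives the least-squares fit when p < n_train and the
            # minimum-norm interpolating solution when p > n_train.
            coef, *_ = np.linalg.lstsq(Phi_train, y_train, rcond=None)
            errors[p].append(np.mean((Phi_test @ coef - y_test) ** 2))
    # The median is used because errors near the threshold are heavy-tailed.
    return {p: float(np.median(errs)) for p, errs in errors.items()}


if __name__ == "__main__":
    widths = [5, 10, 20, 30, 40, 50, 80, 200, 800]
    for p, err in double_descent_curve(widths).items():
        print(f"features={p:4d}  median test MSE={err:.3f}")
```

In setups like this one, test error typically rises as the feature count approaches the number of training points and falls again beyond it, which is the shape the abstract refers to. Whether and where the second descent appears depends on the noise level, the feature map, and the estimator, so the output should be read as an illustration rather than as evidence for any particular theory.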

