LG AI IT ML

Oct 12, 2020

An Information-Theoretic Perspective on Overfitting and Underfitting

Daniel Bashir, George D. Montanez, Sonia Sehra, Pedro Sandoval Segura, Julius Lauw

arXiv:2010.06076v25.056 citations

Originality: Highly original

AI Analysis

This work addresses a foundational problem in machine learning theory: characterizing when a learning algorithm overfits or underfits a dataset. It is relevant to both researchers and practitioners.

The authors develop an information-theoretic framework for understanding overfitting and underfitting. They prove that determining whether an arbitrary classification algorithm will overfit a dataset is formally undecidable, and they provide upper bounds on algorithm capacity.

We present an information-theoretic framework for understanding overfitting and underfitting in machine learning and prove the formal undecidability of determining whether an arbitrary classification algorithm will overfit a dataset. Measuring algorithm capacity via the information transferred from datasets to models, we consider mismatches between algorithm capacities and datasets to provide a signature for when a model can overfit or underfit a dataset. We present results upper-bounding algorithm capacity, establish its relationship to quantities in the algorithmic search framework for machine learning, and relate our work to recent information-theoretic approaches to generalization.
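The abstract does not state the formal definition of capacity, so the following is only a toy sketch of the idea of "information transferred from datasets to models". It assumes capacity can be approximated by the mutual information I(D; M) between a dataset D and the model M an algorithm outputs, with all labelings of three fixed points equally likely. The names `mutual_information`, `memorizer`, and `majority` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter
from itertools import product


def mutual_information(pairs):
    """Return I(D; M) in bits for a list of equally likely (dataset, model) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    count_d = Counter(d for d, _ in pairs)
    count_m = Counter(m for _, m in pairs)
    return sum(
        (c / n) * math.log2((c / n) / ((count_d[d] / n) * (count_m[m] / n)))
        for (d, m), c in joint.items()
    )


def memorizer(labels):
    # Hypothetical high-capacity learner: the "model" is the full label table.
    return labels


def majority(labels):
    # Hypothetical low-capacity learner: the "model" is a single majority label.
    return int(2 * sum(labels) > len(labels))


# Assumption: every labeling of three fixed points is an equally likely dataset.
datasets = list(product([0, 1], repeat=3))

info_memorizer = mutual_information([(d, memorizer(d)) for d in datasets])
info_majority = mutual_information([(d, majority(d)) for d in datasets])

print(f"memorizer: {info_memorizer:.1f} bits")
print(f"majority:  {info_majority:.1f} bits")
```

Under these assumptions the memorizer carries all 3 bits of each dataset into its model, while the majority learner carries 1 bit. A gap of this kind between what an algorithm can absorb and what a dataset contains is the sort of mismatch the abstract describes as a signature for overfitting or underfitting.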
