LG ML Mar 3, 2025

Deep Learning is Not So Mysterious or Different

arXiv:2503.02113v230.640 citations, h-index: 41, ICML
Originality: Synthesis-oriented
AI Analysis

This paper challenges a common perception in the ML community by demystifying deep learning's generalization behaviour, while acknowledging that deep learning remains distinct in aspects such as representation learning.

The paper argues that deep learning's anomalous generalization behaviors, such as benign overfitting and double descent, are not unique or mysterious and can be explained using established frameworks like PAC-Bayes, with soft inductive biases as a unifying principle.

Deep neural networks are often seen as different from other model classes by defying conventional notions of generalization. Popular examples of anomalous generalization behaviour include benign overfitting, double descent, and the success of overparametrization. We argue that these phenomena are not distinct to neural networks, or particularly mysterious. Moreover, this generalization behaviour can be intuitively understood, and rigorously characterized, using long-standing generalization frameworks such as PAC-Bayes and countable hypothesis bounds. We present soft inductive biases as a key unifying principle in explaining these phenomena: rather than restricting the hypothesis space to avoid overfitting, embrace a flexible hypothesis space, with a soft preference for simpler solutions that are consistent with the data. This principle can be encoded in many model classes, and thus deep learning is not as mysterious or different from other model classes as it might seem. However, we also highlight how deep learning is relatively distinct in other ways, such as its ability for representation learning, phenomena such as mode connectivity, and its relative universality.
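
The soft inductive bias described in the abstract can be illustrated with a small sketch. This is not code from the paper: the polynomial model, the order-dependent penalty `lam * gamma**j`, and the values of `degree`, `lam`, and `gamma` are all assumptions chosen for illustration. The model keeps a flexible hypothesis space (a degree-15 polynomial fit to 10 points) but makes higher-order coefficients increasingly expensive, which expresses a soft preference for simpler solutions rather than a hard restriction.

```python
import numpy as np

def poly_features(x, degree):
    """Monomial features [1, x, x^2, ..., x^degree]."""
    return np.vander(x, degree + 1, increasing=True)

def order_penalty(degree, lam, gamma):
    """Hypothetical penalty weights: the j-th coefficient costs lam * gamma**j."""
    return lam * gamma ** np.arange(degree + 1)

def fit_soft_bias(x, y, degree=15, lam=1e-3, gamma=4.0):
    """Minimize ||Phi w - y||^2 + sum_j lam * gamma**j * w_j**2 in closed form.

    High-order terms remain available, but the fit only uses them
    when the data justify their cost.
    """
    Phi = poly_features(x, degree)
    A = Phi.T @ Phi + np.diag(order_penalty(degree, lam, gamma))
    return np.linalg.solve(A, Phi.T @ y)

def objective(w, x, y, lam=1e-3, gamma=4.0):
    """Penalized squared error for a coefficient vector w."""
    degree = len(w) - 1
    residual = poly_features(x, degree) @ w - y
    return np.sum(residual ** 2) + np.sum(order_penalty(degree, lam, gamma) * w ** 2)

# Toy data (assumed): a simple linear trend plus noise, 10 training points.
rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, size=10)
y_train = 1.0 + 2.0 * x_train + 0.1 * rng.normal(size=10)

# Flexible model with a soft preference for low-order terms.
w_soft = fit_soft_bias(x_train, y_train)

# Same hypothesis space with no order-dependent preference
# (minimum-norm least-squares solution, for comparison).
w_plain, *_ = np.linalg.lstsq(poly_features(x_train, 15), y_train, rcond=None)

print("soft-bias coefficients: ", np.round(w_soft, 3))
print("unpenalized coefficients:", np.round(w_plain, 3))
```

Both fits use the same 16-parameter hypothesis space; only the preference over solutions differs. Comparing the two coefficient vectors, or evaluating both on held-out points, is one way to explore how the soft preference changes which solution is selected.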
