LG · Nov 6, 2025

Sharp Minima Can Generalize: A Loss Landscape Perspective On Data

Raymond Fan, Bryce Sandlund, Lin Myat Ko

arXiv:2511.04808v1 · 1 citation · h-index: 5

Originality: Incremental advance

AI Analysis

This work addresses the problem of understanding generalization in deep learning. It shows that the amount of training data reshapes the loss landscape in a way that the flat-minima picture alone does not capture.

The paper challenges the volume hypothesis by showing that sharp minima can generalize well but are rarely found because their volumes are small. Increasing the amount of training data makes these minima relatively larger, and therefore more likely to be found.

The volume hypothesis suggests deep learning is effective because it is likely to find flat minima due to their large volumes, and flat minima generalize well. This picture does not explain the role of large datasets in generalization. Measuring minima volumes under varying amounts of training data reveals that sharp minima which generalize well do exist, but are unlikely to be found because of their small volumes. Increasing the amount of data changes the loss landscape such that previously small generalizing minima become (relatively) large.
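To make the link between sharpness and volume concrete, here is a minimal sketch under a toy assumption: near a minimum the loss is a pure quadratic with known Hessian eigenvalues, so the region with loss below a threshold is an ellipsoid whose volume has a closed form. This is not the paper's measurement procedure; the function name `log_basin_volume`, the eigenvalues, and the threshold are all hypothetical choices for illustration.

```python
import math

def log_basin_volume(hessian_eigenvalues, loss_threshold):
    """Log-volume of {theta : 0.5 * sum(lam_i * theta_i**2) <= loss_threshold}.

    Toy assumption: the loss is exactly quadratic around the minimum, so the
    region is an ellipsoid with semi-axes sqrt(2 * loss_threshold / lam_i).
    """
    d = len(hessian_eigenvalues)
    # Log-volume of the unit d-ball: pi^(d/2) / Gamma(d/2 + 1).
    log_unit_ball = 0.5 * d * math.log(math.pi) - math.lgamma(0.5 * d + 1)
    # Sum of the logs of the semi-axis lengths.
    log_axes = sum(
        0.5 * math.log(2 * loss_threshold / lam) for lam in hessian_eigenvalues
    )
    return log_unit_ball + log_axes

# Hypothetical 10-dimensional minima: one flat, one 100x sharper in every direction.
flat = [0.1] * 10
sharp = [10.0] * 10

log_ratio = log_basin_volume(flat, 0.01) - log_basin_volume(sharp, 0.01)
print(f"log(volume_flat / volume_sharp) = {log_ratio:.2f}")
```

In this toy setting the gap grows with dimension: each direction contributes half the log of the curvature ratio, so in 10 dimensions a 100x sharper minimum has a basin about 10^10 times smaller. A search that reaches basins roughly in proportion to their volume would therefore rarely find the sharp one, whether or not it generalizes well.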
