Spectral Bias in Practice: The Role of Function Frequency in Generalization
This work provides empirical evidence for spectral bias in real-world image classification, addressing a gap in our understanding of generalization in deep learning. The contribution is incremental in that it tests existing theoretical ideas about spectral bias rather than introducing a new theory.
The authors test whether spectral bias appears in practical deep learning by measuring the frequencies of functions learned by image classification networks on CIFAR-10 and ImageNet. They find that these networks do exhibit spectral bias. On CIFAR-10, interventions that improve test accuracy tend to produce learned functions with higher frequencies overall but lower frequencies near examples from each class. This trend holds across variation in training time, model architecture, number of training examples, data augmentation, and self-distillation.
Despite their ability to represent highly expressive functions, deep learning models seem to find simple solutions that generalize surprisingly well. Spectral bias -- the tendency of neural networks to prioritize learning low frequency functions -- is one possible explanation for this phenomenon, but so far spectral bias has primarily been observed in theoretical models and simplified experiments. In this work, we propose methodologies for measuring spectral bias in modern image classification networks on CIFAR-10 and ImageNet. We find that these networks indeed exhibit spectral bias, and that interventions that improve test accuracy on CIFAR-10 tend to produce learned functions that have higher frequencies overall but lower frequencies in the vicinity of examples from each class. This trend holds across variation in training time, model architecture, number of training examples, data augmentation, and self-distillation. We also explore the connections between function frequency and image frequency and find that spectral bias is sensitive to the low frequencies prevalent in natural images. On ImageNet, we find that learned function frequency also varies with internal class diversity, with higher frequencies on more diverse classes. Our work enables measuring and ultimately influencing the spectral behavior of neural networks used for image classification, and is a step towards understanding why deep models generalize well.
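To make the notion of "function frequency" concrete, the following is a minimal sketch of one possible proxy: evaluate a scalar-valued function along the straight line between two inputs and summarize the discrete Fourier spectrum of the resulting 1-D signal by its energy-weighted mean frequency. This is an illustrative assumption, not the authors' measurement procedure, which the abstract does not specify. The names `interpolate` and `path_mean_frequency`, and the toy functions in the usage lines, are hypothetical.

```python
import cmath
import math


def interpolate(x0, x1, t):
    """Point at fraction t on the straight line from x0 to x1 (lists of floats)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]


def path_mean_frequency(f, x0, x1, n_samples=64):
    """Energy-weighted mean DFT frequency of f along the path x0 -> x1.

    Assumption: f maps a list of floats to a single float (for a classifier,
    this could be one class score). The path is sampled at t = i / n_samples,
    i.e. on [0, 1), so a signal periodic in t is captured without leakage.
    """
    ts = [i / n_samples for i in range(n_samples)]
    ys = [f(interpolate(x0, x1, t)) for t in ts]

    # Remove the mean so the zero-frequency component does not dominate.
    mean = sum(ys) / n_samples
    ys = [y - mean for y in ys]

    # Plain DFT over the positive frequencies 1 .. n_samples // 2.
    freqs = range(1, n_samples // 2 + 1)
    energies = []
    for k in freqs:
        coeff = sum(
            y * cmath.exp(-2j * math.pi * k * i / n_samples)
            for i, y in enumerate(ys)
        )
        energies.append(abs(coeff) ** 2)

    total = sum(energies)
    if total == 0.0:
        return 0.0  # constant along the path: no oscillation at all
    return sum(k * e for k, e in zip(freqs, energies)) / total


# Toy usage: two 1-D functions that oscillate slowly and quickly.
low = lambda x: math.sin(2 * math.pi * 1 * x[0])
high = lambda x: math.sin(2 * math.pi * 5 * x[0])
low_freq = path_mean_frequency(low, [0.0], [1.0])
high_freq = path_mean_frequency(high, [0.0], [1.0])
```

Under this proxy, a function that oscillates more along the path gets a larger value, so `high_freq` exceeds `low_freq`. For a real network the path is generally not periodic, so the spectrum would include leakage, and the score should be read as a rough relative measure only.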