LG, ML | Nov 6, 2021

Understanding Layer-wise Contributions in Deep Neural Networks through Spectral Analysis

arXiv:2111.03972v3 | 6 citations
Originality: Incremental advance
AI Analysis

This work offers insight into how individual layers contribute in deep neural networks. It is an incremental advance, aimed mainly at researchers in deep learning theory.

The paper studies how different layers of a deep neural network contribute to reducing generalization error by analyzing each layer's spectral bias. It proves that initial layers exhibit a larger bias towards high-frequency functions on the unit sphere, and it validates this result empirically on high-dimensional datasets.

Spectral analysis is a powerful tool, decomposing any function into simpler parts. In machine learning, Mercer's theorem generalizes this idea, providing for any kernel and input distribution a natural basis of functions of increasing frequency. More recently, several works have extended this analysis to deep neural networks through the Neural Tangent Kernel framework. In this work, we analyze the layer-wise spectral bias of Deep Neural Networks and relate it to the contributions of different layers in the reduction of generalization error for a given target function. We utilize the properties of Hermite polynomials and Spherical Harmonics to prove that initial layers exhibit a larger bias towards high-frequency functions defined on the unit sphere. We further provide empirical results validating our theory on high-dimensional datasets for Deep Neural Networks.
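
To make the Mercer decomposition mentioned above concrete, here is a minimal numerical sketch. It is not the paper's method or experiment. It assumes the simplest possible setting: the unit circle (dimension 2) with a uniform input distribution, and the order-1 arc-cosine kernel (the kernel associated with an infinitely wide ReLU layer) as a stand-in dot-product kernel. The function names `arc_cosine_kernel` and `circle_spectrum` are hypothetical and chosen for illustration. On equally spaced points of the circle, the Gram matrix of a dot-product kernel is circulant, so its eigenvectors are Fourier modes and its eigenvalues can be read off with an FFT of the first row.

```python
import numpy as np

def arc_cosine_kernel(u):
    """Order-1 arc-cosine kernel as a function of u = <x, y> for unit vectors.

    Assumed stand-in kernel for illustration; normalized so that kappa(1) = 1.
    """
    u = np.clip(u, -1.0, 1.0)  # guard against rounding outside [-1, 1]
    return (np.sqrt(1.0 - u**2) + u * (np.pi - np.arccos(u))) / np.pi

def circle_spectrum(kappa, n=512):
    """Approximate Mercer eigenvalues of a dot-product kernel on the unit circle.

    With n equally spaced points, K[i, j] = kappa(cos(theta_i - theta_j)) is a
    symmetric circulant matrix, so the eigenvalues of K / n are the DFT of its
    first row divided by n. Entry m of the result belongs to frequency m.
    """
    theta = 2.0 * np.pi * np.arange(n) / n
    first_row = kappa(np.cos(theta))
    eigenvalues = np.fft.fft(first_row).real / n
    return eigenvalues[: n // 2]  # frequencies 0 .. n/2 - 1

spectrum = circle_spectrum(arc_cosine_kernel)
for m in range(6):
    print(f"frequency {m}: eigenvalue {spectrum[m]:.6f}")
```

For this particular kernel, the eigenvalues at frequencies 0 and 1 come out close to 4/π² and 1/4, the odd frequencies above 1 are numerically zero, and the even frequencies decay quickly. That decay is the kind of spectral bias the abstract refers to: components of a target function at frequencies with small eigenvalues are learned slowly. The layer-wise analysis on higher-dimensional spheres in the paper uses spherical harmonics rather than Fourier modes, which this sketch does not attempt to reproduce.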
