On the Spectral Bias of Convolutional Neural Tangent and Gaussian Process Kernels
The paper provides a concrete quantitative characterization of over-parameterized convolutional network architectures; the contribution is incremental but useful for researchers in deep learning theory.
The paper tackles the problem of understanding the spectral properties of over-parameterized convolutional neural networks by analyzing their Gaussian process and neural tangent kernels. It proves that the eigenfunctions of these kernels are products of spherical harmonics and bounds the eigenvalues to show that they decay polynomially at quantified rates.
We study the properties of various over-parameterized convolutional neural architectures through their respective Gaussian process and neural tangent kernels. We prove that, with normalized multi-channel input and ReLU activation, the eigenfunctions of these kernels under the uniform measure are formed by products of spherical harmonics, defined over the channels of the different pixels. We next use hierarchical factorizable kernels to bound their respective eigenvalues. We show that the eigenvalues decay polynomially, quantify the rate of decay, and derive measures that reflect the composition of hierarchical features in these networks. Our results provide a concrete quantitative characterization of over-parameterized convolutional network architectures.
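
To make "polynomial eigenvalue decay" concrete, the following is a minimal numerical sketch of the simplest related case, not the convolutional kernels analyzed in the paper. It assumes a two-layer, bias-free, fully connected ReLU network with inputs on the unit circle, whose Gaussian process kernel is the order-1 arc-cosine kernel and whose neural tangent kernel is taken (up to normalization) as u·κ0(u) + κ1(u). On the circle the spherical harmonics reduce to Fourier modes, so the eigenvalues under the uniform measure can be approximated by a discrete Fourier transform of the kernel sampled on a uniform grid. The function names, grid size, and fitting range are illustrative choices.

```python
import numpy as np


def relu_gp_kernel(u):
    """Order-1 arc-cosine kernel as a function of u = <x, z> for unit-norm x, z.

    Assumption: this stands in for the GP kernel of a two-layer ReLU network.
    """
    theta = np.arccos(np.clip(u, -1.0, 1.0))
    return (np.sin(theta) + (np.pi - theta) * u) / np.pi


def relu_ntk_kernel(u):
    """Two-layer ReLU NTK, assumed here to be u * kappa_0(u) + kappa_1(u)."""
    theta = np.arccos(np.clip(u, -1.0, 1.0))
    kappa_0 = (np.pi - theta) / np.pi  # order-0 arc-cosine kernel
    return u * kappa_0 + relu_gp_kernel(u)


def circle_eigenvalues(kernel, n_points=4096):
    """Approximate eigenvalues of a zonal kernel on S^1 under the uniform measure.

    On a uniform grid the Gram matrix is circulant, so its eigenvalues are the
    DFT of its first row; dividing by n_points approximates the integral
    operator with respect to the uniform probability measure.
    Entry k of the result corresponds to frequency k (for k < n_points / 2).
    """
    angles = 2.0 * np.pi * np.arange(n_points) / n_points
    first_row = kernel(np.cos(angles))
    return np.real(np.fft.fft(first_row)) / n_points


def fit_decay_exponent(eigs, freqs):
    """Least-squares slope of log(eigenvalue) vs. log(frequency), sign-flipped."""
    slope, _ = np.polyfit(np.log(freqs), np.log(eigs[freqs]), 1)
    return -slope


# Odd frequencies k >= 3 vanish for these bias-free kernels, so fit on even k only.
even_freqs = np.arange(20, 201, 2)
gp_exponent = fit_decay_exponent(circle_eigenvalues(relu_gp_kernel), even_freqs)
ntk_exponent = fit_decay_exponent(circle_eigenvalues(relu_ntk_kernel), even_freqs)
print(f"GP kernel decay exponent:  {gp_exponent:.2f}")
print(f"NTK decay exponent:        {ntk_exponent:.2f}")
```

In this toy setting the fitted exponents should come out close to 4 for the GP kernel and 2 for the NTK, which illustrates what a quantified polynomial decay rate looks like. The rates for the convolutional architectures studied in the paper depend on the architecture and are not reproduced by this sketch.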