Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent
This work provides theoretical insights into the convergence behavior of over-parametrized CNNs, an incremental contribution of interest to researchers in deep learning theory.
The paper tackles the problem of analyzing the convergence rate of an over-parametrized convolutional neural network trained with gradient descent for image classification, deriving a bound on the difference between the misclassification risk of the estimate and the minimal possible value.
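To make the bounded quantity concrete, it can be written out as follows. The notation is an assumption made for illustration and is not taken from the paper: $(X, Y)$ denotes an image with its class label, $\mathcal{D}_n$ the training sample of size $n$, and $f_n$ the classifier learned from $\mathcal{D}_n$.

```latex
% Excess misclassification risk of the learned classifier f_n.
% All symbols are illustrative assumptions, not the paper's own notation.
\[
  \mathbf{P}\{\, f_n(X) \neq Y \mid \mathcal{D}_n \,\}
  \;-\;
  \min_{f} \mathbf{P}\{\, f(X) \neq Y \,\}
\]
% The minimum runs over all measurable classifiers f and is attained by
% the Bayes classifier, which predicts the most probable label given X.
```

The difference is non-negative by construction, so a bound on its rate of convergence describes how quickly the learned classifier approaches the best achievable error as $n$ grows.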
Image classification based on over-parametrized convolutional neural networks with a global average-pooling layer is considered. The weights of the network are learned by gradient descent. A bound on the rate of convergence of the difference between the misclassification risk of the newly introduced convolutional neural network estimate and the minimal possible value is derived.
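The setting described above can be illustrated with a minimal sketch in plain Python: one convolutional layer with ReLU activation, a global average-pooling layer, a linear output, and weights learned by full-batch gradient descent. Everything specific here is an assumption made for illustration and not the paper's construction: the toy data, the squared loss, the labels in {0, 1}, the filter count, the step size, and all function names are hypothetical.

```python
import random

def make_data(n, size=4, seed=0):
    """Toy data (hypothetical): bright images get label 1, dim images label 0."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        y = i % 2
        lo = 0.5 * y  # pixels in [0, 0.5) for class 0, [0.5, 1) for class 1
        img = [[lo + 0.5 * rng.random() for _ in range(size)] for _ in range(size)]
        data.append((img, y))
    return data

def init_params(num_filters=8, ksize=2, seed=1):
    """More filters than this toy task needs, as a stand-in for over-parametrization."""
    rng = random.Random(seed)
    return {
        "w": [[[rng.uniform(-0.5, 0.5) for _ in range(ksize)] for _ in range(ksize)]
              for _ in range(num_filters)],
        "b": [0.0] * num_filters,
        "v": [rng.uniform(-0.5, 0.5) for _ in range(num_filters)],
    }

def forward(params, img):
    """Return the output, the pooled feature per filter, and the active positions."""
    ksize = len(params["w"][0])
    out_size = len(img) - ksize + 1
    pooled, active = [], []
    for w, b in zip(params["w"], params["b"]):
        total, on = 0.0, []
        for i in range(out_size):
            for j in range(out_size):
                pre = b + sum(w[a][c] * img[i + a][j + c]
                              for a in range(ksize) for c in range(ksize))
                if pre > 0.0:  # ReLU keeps only positive pre-activations
                    total += pre
                    on.append((i, j))
        pooled.append(total / out_size ** 2)  # global average pooling
        active.append(on)
    output = sum(v * h for v, h in zip(params["v"], pooled))
    return output, pooled, active

def empirical_loss(params, data):
    """Mean squared error on the sample (squared loss is an assumption)."""
    return sum((forward(params, x)[0] - y) ** 2 for x, y in data) / len(data)

def gradient_step(params, data, lr):
    """One full-batch gradient descent step on the empirical squared loss."""
    ksize, k = len(params["w"][0]), len(params["v"])
    gw = [[[0.0] * ksize for _ in range(ksize)] for _ in range(k)]
    gb, gv = [0.0] * k, [0.0] * k
    for img, y in data:
        out, pooled, active = forward(params, img)
        positions = (len(img) - ksize + 1) ** 2
        d_out = 2.0 * (out - y) / len(data)  # derivative of the loss w.r.t. the output
        for f in range(k):
            gv[f] += d_out * pooled[f]
            scale = d_out * params["v"][f] / positions
            gb[f] += scale * len(active[f])
            for i, j in active[f]:
                for a in range(ksize):
                    for c in range(ksize):
                        gw[f][a][c] += scale * img[i + a][j + c]
    # Apply all updates only after every gradient has been accumulated.
    for f in range(k):
        params["v"][f] -= lr * gv[f]
        params["b"][f] -= lr * gb[f]
        for a in range(ksize):
            for c in range(ksize):
                params["w"][f][a][c] -= lr * gw[f][a][c]

def train(params, data, steps=100, lr=0.05):
    for _ in range(steps):
        gradient_step(params, data, lr)
    return params

def classify(params, img):
    """Plug-in rule: threshold the network output at 1/2."""
    return 1 if forward(params, img)[0] >= 0.5 else 0
```

A typical use would be to call `make_data`, `init_params`, and `train` in that order, then compare `empirical_loss` before and after training. The sketch only shows the architecture and the learning procedure; it says nothing about the rate of convergence that the paper derives.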