ML, LG | May 13, 2024

Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent

Michael Kohler, Adam Krzyzak, Benjamin Walter

arXiv:2405.07619v13.11 citations | h-index: 44 | J Stat Plan Inference

Originality: Synthesis-oriented

AI Analysis

This work provides theoretical insights into the convergence behavior of over-parametrized CNNs. The contribution is incremental and mainly of interest to researchers in deep learning theory.

The paper analyzes the rate of convergence of an over-parametrized convolutional neural network trained by gradient descent for image classification. It derives a bound on the difference between the misclassification risk of the estimate and the minimal possible value.
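The bounded quantity is commonly called the excess misclassification risk. As a sketch in standard notation (the symbols below are assumptions for illustration and are not taken from this summary): let (X, Y) be an image-label pair, D_n the training sample, and f_n the classifier learned from D_n. The difference in question can then be written as

```latex
% Assumed notation: (X, Y) image-label pair, \mathcal{D}_n training data,
% f_n the learned classifier, minimum taken over all measurable classifiers f.
\[
  \mathbf{P}\{ f_n(X) \neq Y \mid \mathcal{D}_n \}
  \;-\;
  \min_{f} \mathbf{P}\{ f(X) \neq Y \}
\]
```

The first term is the misclassification risk of the estimate and the second is the smallest risk any classifier can achieve, so the difference is always nonnegative. A rate of convergence describes how fast this difference shrinks as the sample size n grows.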

Image classification based on over-parametrized convolutional neural networks with a global average-pooling layer is considered. The weights of the network are learned by gradient descent. A bound on the rate of convergence of the difference between the misclassification risk of the newly introduced convolutional neural network estimate and the minimal possible value is derived.
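To make the architecture concrete, here is a minimal NumPy sketch of a forward pass through one convolutional layer followed by global average pooling and a linear output. It is an illustration under stated assumptions, not the network defined in the paper: the single layer, the ReLU activation, the binary labels, the 0.5 threshold, and all function names (`conv2d_valid`, `gap_cnn_forward`, `classify`) are hypothetical choices made here.

```python
import numpy as np


def conv2d_valid(image, kernel):
    """'Valid' 2D cross-correlation of one grayscale image with one kernel."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def gap_cnn_forward(image, kernels, out_weights):
    """Conv layer + ReLU + global average pooling + linear output.

    Global average pooling reduces each feature map to its mean, so the
    number of pooled features equals the number of kernels regardless of
    the image size.
    """
    pooled = np.array(
        [np.maximum(conv2d_valid(image, k), 0.0).mean() for k in kernels]
    )
    return float(pooled @ out_weights)


def classify(image, kernels, out_weights, threshold=0.5):
    """Assumed binary plug-in rule: predict 1 if the output reaches the threshold."""
    return int(gap_cnn_forward(image, kernels, out_weights) >= threshold)
```

Because the pooling step averages over all spatial positions, the same weights can be applied to images of different sizes. In an over-parametrized setting the number of kernels would be large relative to the sample size, and all weights would be updated by gradient descent on an empirical loss; that training loop is omitted here.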

