ML LG ST Sep 26, 2023

On Excess Risk Convergence Rates of Neural Network Classifiers

arXiv:2309.15075v1
3 citations
h-index: 28
Originality: Incremental advance
AI Analysis

This work provides theoretical guarantees for neural network classifiers in a setting closer to practice: the target function class contains the Barron functions, and the classifier is trained by minimizing a surrogate loss rather than the $0$-$1$ loss.

The paper analyzes the excess risk convergence rates of neural network classifiers in a binary classification setting. It shows that they achieve a dimension-free, uniform convergence rate of $n^{-1/3}$ up to a logarithmic factor, and that this rate is minimax optimal up to that factor.

The recent success of neural networks in pattern recognition and classification problems suggests that neural networks possess qualities distinct from those of more classical classifiers such as SVMs or boosting classifiers. This paper studies the performance of plug-in classifiers based on neural networks in a binary classification setting, as measured by their excess risks. Compared to the typical settings imposed in the literature, we consider a more general scenario that resembles actual practice in two respects: first, the function class to be approximated includes the Barron functions as a proper subset, and second, the neural network classifier constructed is the minimizer of a surrogate loss instead of the $0$-$1$ loss, so that gradient descent-based numerical optimization can be easily applied. While the class of functions we consider is so large that optimal rates cannot be faster than $n^{-\frac{1}{3}}$, it is a regime in which dimension-free rates are possible and the approximation power of neural networks can be exploited. In particular, we analyze the estimation and approximation properties of neural networks to obtain a dimension-free, uniform rate of convergence for the excess risk. Finally, we show that the rate obtained is in fact minimax optimal up to a logarithmic factor, and the minimax lower bound shows the effect of the margin assumption in this regime.
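
The quantities named in the abstract can be written out as follows. This is only a sketch in standard textbook notation, not the paper's own: the symbols $R$, $R^{*}$, $\mathcal{E}$, $\phi$, $\mathcal{F}_n$, $\mathcal{P}$ and $L(n)$ are assumptions introduced here for illustration, and the exact form of the logarithmic factor is not given in the abstract.

```latex
% Assumed notation (not taken from the source):
%   labels Y \in \{-1,+1\}; f is a real-valued score; the classifier is sign(f);
%   \phi is a surrogate loss (e.g. logistic or hinge); \mathcal{F}_n is a
%   neural network class; \mathcal{P} is the class of distributions considered.
\begin{align*}
R(f) &= \mathbb{P}\bigl(\operatorname{sign}(f(X)) \neq Y\bigr),
  \qquad R^{*} = \inf_{g \text{ measurable}} R(g), \\
\mathcal{E}(f) &= R(f) - R^{*}
  && \text{(excess risk)}, \\
\hat f_n &\in \operatorname*{arg\,min}_{f \in \mathcal{F}_n}
  \frac{1}{n} \sum_{i=1}^{n} \phi\bigl(Y_i f(X_i)\bigr)
  && \text{(surrogate-loss minimizer)}, \\
\sup_{P \in \mathcal{P}} \mathbb{E}\bigl[\mathcal{E}(\hat f_n)\bigr]
  &\lesssim n^{-1/3}\, L(n)
  && \text{(uniform rate; } L(n) \text{ an unspecified logarithmic factor)}.
\end{align*}
```

Read this way, "dimension-free" means that the exponent $-1/3$ does not depend on the input dimension, and "uniform" means that the bound holds over the whole class $\mathcal{P}$ rather than for a single distribution.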
