MLLGJun 17, 2025

Optimal Convergence Rates of Deep Neural Network Classifiers

arXiv:2506.14899v1h-index: 2
Originality Incremental advance
AI Analysis

This provides theoretical justification for the performance of ReLU DNNs in high-dimensional classification tasks, though it builds incrementally on prior work.

The paper tackles the binary classification problem under Tsybakov noise and compositional assumptions, proving an optimal convergence rate for excess 0-1 risk that is independent of input dimension and showing that ReLU deep neural networks with hinge loss can achieve this rate up to a logarithmic factor.

In this paper, we study the binary classification problem on $[0,1]^d$ under the Tsybakov noise condition (with exponent $s \in [0,\infty]$) and the compositional assumption. This assumption requires the conditional class probability function of the data distribution to be the composition of $q+1$ vector-valued multivariate functions, where each component function is either a maximum value function or a Hölder-$β$ smooth function that depends only on $d_*$ of its input variables. Notably, $d_*$ can be significantly smaller than the input dimension $d$. We prove that, under these conditions, the optimal convergence rate for the excess 0-1 risk of classifiers is $$ \left( \frac{1}{n} \right)^{\frac{β\cdot(1\wedgeβ)^q}{{\frac{d_*}{s+1}+(1+\frac{1}{s+1})\cdotβ\cdot(1\wedgeβ)^q}}}\;\;\;, $$ which is independent of the input dimension $d$. Additionally, we demonstrate that ReLU deep neural networks (DNNs) trained with hinge loss can achieve this optimal convergence rate up to a logarithmic factor. This result provides theoretical justification for the excellent performance of ReLU DNNs in practical classification tasks, particularly in high-dimensional settings. The technique used to establish these results extends the oracle inequality presented in our previous work. The generalized approach is of independent interest.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes