LG | Jul 4, 2023

Nonparametric Classification on Low Dimensional Manifolds using Overparameterized Convolutional Residual Networks

arXiv:2307.01649v3 | 3 citations | h-index: 43
Originality: Incremental advance
AI Analysis

The paper offers a partial theoretical justification for the advantage of overparameterized ConvResNeXts, which include ConvResNets as a special case, over conventional machine learning models, a performance gap that conventional wisdom does not explain.

The paper tackles the unexplained strong performance of overparameterized convolutional residual networks by analyzing ConvResNeXts trained with weight decay, showing they can adapt to smooth functions on low-dimensional manifolds and avoid the curse of dimensionality.

Convolutional residual neural networks (ConvResNets), though overparameterized, can achieve remarkable prediction performance in practice, which cannot be well explained by conventional wisdom. To bridge this gap, we study the performance of ConvResNeXts, which cover ConvResNets as a special case, trained with weight decay from the perspective of nonparametric classification. Our analysis allows for infinitely many building blocks in ConvResNeXts, and shows that weight decay implicitly enforces sparsity on these blocks. Specifically, we consider a smooth target function supported on a low-dimensional manifold, then prove that ConvResNeXts can adapt to the function smoothness and low-dimensional structures and efficiently learn the function without suffering from the curse of dimensionality. Our findings partially justify the advantage of overparameterized ConvResNeXts over conventional machine learning models.
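
To make the architecture in the abstract concrete, here is a minimal sketch of a ConvResNeXt-style block: an identity shortcut plus a sum of parallel convolutional paths, with a squared-norm weight-decay penalty. It is an illustration under simplifying assumptions, not the paper's construction: signals are 1D and single-channel, the penalty form (lam/2 times the sum of squared weights) is the common textbook choice, and all names (conv1d, path, resnext_block, weight_decay_penalty) are hypothetical.

```python
# Illustrative sketch only: a 1D, single-channel ConvResNeXt-style block.
# All function names and the penalty form are assumptions, not the paper's code.

def conv1d(x, kernel):
    """Same-length 1D cross-correlation with zero padding."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * (k - 1 - pad)
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]

def relu(x):
    return [max(0.0, v) for v in x]

def path(x, kernels):
    """One parallel path: stacked convolutions with ReLU between layers."""
    h = list(x)
    for idx, kernel in enumerate(kernels):
        h = conv1d(h, kernel)
        if idx < len(kernels) - 1:  # no activation after the last layer
            h = relu(h)
    return h

def resnext_block(x, paths):
    """Identity shortcut plus the sum of all parallel paths.

    With a single path this is a ConvResNet-style block."""
    out = list(x)
    for kernels in paths:
        out = [o + p for o, p in zip(out, path(x, kernels))]
    return out

def weight_decay_penalty(paths, lam):
    """Assumed penalty: lam/2 times the sum of squared weights."""
    return 0.5 * lam * sum(w * w
                           for kernels in paths
                           for kernel in kernels
                           for w in kernel)

x = [1.0, -2.0, 3.0]
zero_path = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]  # a path shrunk to zero
identity_path = [[0.0, 1.0, 0.0]]               # one layer that copies x

y_zero = resnext_block(x, [zero_path])
y_sum = resnext_block(x, [identity_path, zero_path])
penalty = weight_decay_penalty([[[1.0, 2.0]], [[3.0]]], lam=2.0)
```

A path whose weights are all zero contributes nothing to the output, so the block falls back to the identity shortcut. This is one informal way to read the abstract's statement that weight decay implicitly enforces sparsity on the building blocks; the sketch does not reproduce the paper's analysis.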
