ML LGSep 7, 2021

Besov Function Approximation and Binary Classification on Low-Dimensional Manifolds Using Convolutional Residual Networks

Hao Liu, Minshuo Chen, Tuo Zhao, Wenjing Liao

arXiv:2109.02832v219.045 citations

Originality Highly original

AI Analysis

This work addresses the gap between theoretical sample complexity and empirical success in deep learning for researchers, providing a foundation for understanding neural networks on high-dimensional data with low-dimensional structures.

The paper tackles the problem of high sample complexity in deep learning by exploiting low-dimensional geometric structures of data, showing that convolutional residual networks can approximate Besov functions on manifolds and achieve binary classification with excess risk depending on intrinsic dimension rather than data dimension.

Most of existing statistical theories on deep neural networks have sample complexities cursed by the data dimension and therefore cannot well explain the empirical success of deep learning on high-dimensional data. To bridge this gap, we propose to exploit low-dimensional geometric structures of the real world data sets. We establish theoretical guarantees of convolutional residual networks (ConvResNet) in terms of function approximation and statistical estimation for binary classification. Specifically, given the data lying on a $d$-dimensional manifold isometrically embedded in $\mathbb{R}^D$, we prove that if the network architecture is properly chosen, ConvResNets can (1) approximate Besov functions on manifolds with arbitrary accuracy, and (2) learn a classifier by minimizing the empirical logistic risk, which gives an excess risk in the order of $n^{-\frac{s}{2s+2(s\vee d)}}$, where $s$ is a smoothness parameter. This implies that the sample complexity depends on the intrinsic dimension $d$, instead of the data dimension $D$. Our results demonstrate that ConvResNets are adaptive to low-dimensional structures of data sets.

View on arXiv PDF

Similar