Statistically guided deep learning
This paper provides a statistically rigorous deep learning method for regression problems, though it appears incremental in that it combines existing theoretical insights.
The authors developed a theoretically grounded deep learning algorithm for nonparametric regression using over-parametrized neural networks with logistic activation, gradient descent, and specific design choices for topology, initialization, and hyperparameters. They proved a bound on the expected $L_2$ error and demonstrated improved finite-sample performance on simulated data.
We present a theoretically well-founded deep learning algorithm for nonparametric regression. It uses over-parametrized deep neural networks with the logistic activation function, which are fitted to the given data via gradient descent. We propose a special topology of these networks, a special random initialization of the weights, and a data-dependent choice of the learning rate and the number of gradient descent steps. We prove a theoretical bound on the expected $L_2$ error of this estimate, and illustrate its finite-sample performance by applying it to simulated data. Our results show that a theoretical analysis of deep learning which simultaneously takes into account optimization, generalization, and approximation can result in a new deep learning estimate with improved finite-sample performance.
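To make the ingredients named above concrete, the following is a minimal sketch of an over-parametrized network with logistic activation fitted by plain gradient descent on the empirical $L_2$ risk. It is not the authors' estimate. The topology (a sum of many small parallel networks), the initialization (inner weights uniform on an interval, outer weights zero), and the default learning rate are illustrative assumptions, and the names `init_params`, `predict`, `fit`, `n_nets`, `width`, and `init_scale` are hypothetical. The abstract does not give the data-dependent rule for the learning rate and the number of steps, so both are left as ordinary arguments here.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + np.exp(-z))


def init_params(d, n_nets, width, init_scale, rng):
    """Random initialization (an assumption, not the paper's exact scheme):
    inner weights uniform on [-init_scale, init_scale], outer weights zero."""
    u = lambda *shape: rng.uniform(-init_scale, init_scale, size=shape)
    return {
        "W1": u(n_nets, d, width), "b1": u(n_nets, width),
        "W2": u(n_nets, width, width), "b2": u(n_nets, width),
        "v": np.zeros((n_nets, width)),
    }


def forward(p, X):
    """Sum of n_nets parallel networks, each with two hidden layers."""
    h1 = sigmoid(np.einsum("nd,kdr->knr", X, p["W1"]) + p["b1"][:, None, :])
    h2 = sigmoid(np.einsum("knr,krs->kns", h1, p["W2"]) + p["b2"][:, None, :])
    pred = np.einsum("kns,ks->n", h2, p["v"])
    return pred, h1, h2


def predict(p, X):
    return forward(p, X)[0]


def gradients(p, X, y):
    """Gradient of the empirical L2 risk mean((f(X) - y)^2) via backpropagation."""
    pred, h1, h2 = forward(p, X)
    g = 2.0 * (pred - y) / len(y)                    # d loss / d pred
    dz2 = np.einsum("n,ks->kns", g, p["v"]) * h2 * (1.0 - h2)
    dz1 = np.einsum("kns,krs->knr", dz2, p["W2"]) * h1 * (1.0 - h1)
    grads = {
        "v": np.einsum("kns,n->ks", h2, g),
        "W2": np.einsum("knr,kns->krs", h1, dz2), "b2": dz2.sum(axis=1),
        "W1": np.einsum("nd,knr->kdr", X, dz1), "b1": dz1.sum(axis=1),
    }
    return grads, float(np.mean((pred - y) ** 2))


def fit(X, y, n_nets=50, width=4, init_scale=5.0, lr=None, n_steps=300, seed=0):
    """Full-batch gradient descent. The default step size is a simple
    stability heuristic, not the data-dependent choice from the paper."""
    rng = np.random.default_rng(seed)
    p = init_params(X.shape[1], n_nets, width, init_scale, rng)
    if lr is None:
        lr = 0.5 / (n_nets * width)
    history = []
    for _ in range(n_steps):
        grads, loss = gradients(p, X, y)
        history.append(loss)                         # risk before this update
        for name in p:
            p[name] = p[name] - lr * grads[name]
    return p, history
```

Calling `fit(X, y)` with `X` of shape `(n, d)` and `y` of shape `(n,)` returns the fitted parameters and the empirical risk recorded before each step; `predict(params, X_new)` then evaluates the estimate. Starting the outer weights at zero means the initial estimate is identically zero, so the first recorded risk equals the mean of `y**2`.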