Radius-margin bounds for deep neural networks
This work provides theoretical insights into deep learning generalization, which is a foundational problem for ML researchers, though it appears incremental as it adapts existing SVM bounds to deep networks.
The paper tackles the problem of explaining deep learning's effectiveness by deriving VC bounds for deep feed-forward architectures using radius-margin bounds from SVMs, and relates techniques like Dropout to reducing network capacity.
Explaining the unreasonable effectiveness of deep learning has eluded researchers around the globe. Various authors have proposed multiple metrics to evaluate the capacity of deep architectures. In this paper, we draw on the radius-margin bounds described for a support vector machine (SVM) with hinge loss, apply them to deep feed-forward architectures, and derive Vapnik-Chervonenkis (VC) bounds that differ from earlier bounds stated in terms of the number of weights of the network. In doing so, we also relate the effectiveness of techniques such as Dropout and DropConnect to a reduction in the capacity of the network. Finally, we describe the effect of maximizing the input as well as the output margin to achieve a deep architecture that is robust to input noise.
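For orientation, the classical radius-margin result that the abstract builds on can be sketched numerically. The snippet below is only an illustration of Vapnik's bound for margin hyperplanes, h <= min(ceil(R^2 / margin^2), n) + 1, where R is the radius of a ball containing the data, margin is the separation margin, and n is the input dimension. It is not the deep-network bound derived in the paper, and the function name and arguments are hypothetical, chosen for this example.

```python
import math


def radius_margin_vc_bound(radius: float, margin: float, dim: int) -> int:
    """Classical radius-margin bound on the VC dimension of margin hyperplanes.

    Illustrative helper (hypothetical name): returns
    min(ceil(R^2 / margin^2), dim) + 1. This is the SVM-style bound,
    not the deep-network bound from the paper.
    """
    if radius <= 0 or margin <= 0 or dim < 1:
        raise ValueError("radius and margin must be positive, dim at least 1")
    ratio = math.ceil((radius / margin) ** 2)
    return min(ratio, dim) + 1


# Data in a ball of radius 2 separated with margin 0.5 in 100 dimensions:
# the ratio term (16) is smaller than the dimension, so it drives the bound.
print(radius_margin_vc_bound(2.0, 0.5, 100))  # 17

# In 10 dimensions the dimension term takes over instead.
print(radius_margin_vc_bound(2.0, 0.5, 10))  # 11
```

The point of the sketch is that a larger margin relative to the data radius lowers the capacity bound independently of the input dimension, which is the property the abstract carries over from SVMs to feed-forward networks.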