LG | ML | May 28, 2018

Universality of Deep Convolutional Neural Networks

arXiv:1805.10769v2 | 587 citations
Originality: Highly original
AI Analysis

The paper provides a foundational theoretical result for deep learning: it addresses an open question in learning theory and verifies the efficiency of deep CNNs on high-dimensional data.

The paper addresses the lack of a theoretical foundation for deep convolutional neural networks (CNNs) by proving that they are universal: given sufficient depth, they can approximate any continuous function to arbitrary accuracy. It also gives quantitative estimates in terms of the number of free parameters.

Deep learning has been widely applied and has brought breakthroughs in speech recognition, computer vision, and many other domains. The deep neural network architectures involved and the associated computational issues have been well studied in machine learning. However, a theoretical foundation is still lacking for understanding the approximation or generalization ability of deep learning methods generated by network architectures such as deep convolutional neural networks, which have convolutional structures. Here we show that a deep convolutional neural network (CNN) is universal, meaning that it can be used to approximate any continuous function to an arbitrary accuracy when the depth of the neural network is large enough. This answers an open question in learning theory. Our quantitative estimate, given tightly in terms of the number of free parameters to be computed, verifies the efficiency of deep CNNs in dealing with high-dimensional data. Our study also demonstrates the role of convolutions in deep CNNs.
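
The universality claim in the abstract can be written out as a uniform-approximation statement. What follows is a sketch under assumptions not stated in this summary: the input domain is taken to be a compact set $\Omega \subset \mathbb{R}^d$, and the symbols $\Omega$, $d$, $J$, $f_J$, and $\varepsilon$ are notation introduced here for illustration, with $f_J$ denoting a function produced by a CNN of depth $J$.

```latex
% Universality read as uniform approximation (illustrative notation, assumed here):
% for every continuous target f on a compact domain Omega and every tolerance
% epsilon > 0, some depth J and some depth-J CNN output f_J achieve
\[
  \| f - f_J \|_{C(\Omega)}
  \;=\; \sup_{x \in \Omega} \bigl| f(x) - f_J(x) \bigr|
  \;<\; \varepsilon .
\]
```

In this reading, "when the depth of the neural network is large enough" corresponds to the existence of a suitable $J$ for each $\varepsilon$; the abstract's quantitative estimate would then bound this error in terms of the number of free parameters, though the exact rate is not given in this summary.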

