ML · LG · Jun 17, 2019

Smooth function approximation by deep neural networks with general activation functions

arXiv:1906.06903v2 · 93 citations
Originality: Synthesis-oriented
AI Analysis

This work extends theoretical guarantees on the expressivity of neural networks beyond specific activations such as ReLU. Its contribution is incremental in that it generalizes existing results.

The paper tackles the problem of approximating Hölder smooth functions using deep neural networks with a broad class of activation functions, deriving bounds on depth, width, and sparsity to achieve a given error and proving minimax optimality in regression and classification.

There has been growing interest in the expressivity of deep neural networks. However, most existing work on this topic focuses only on specific activation functions such as ReLU or sigmoid. In this paper, we investigate the approximation ability of deep neural networks with a broad class of activation functions. This class includes most of the frequently used activation functions. We derive the depth, width and sparsity that a deep neural network requires to approximate any Hölder smooth function up to a given approximation error for this large class of activation functions. Based on our approximation error analysis, we derive the minimax optimality of deep neural network estimators with general activation functions in both regression and classification problems.
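
For orientation, the Hölder smoothness that the abstract refers to is commonly defined as sketched below. The notation (smoothness index \beta, input dimension d, domain [0,1]^d, and the norm symbol) is an assumption introduced here for illustration, since this summary does not fix any; the paper's own conventions may differ in details such as using a sum instead of a maximum.

```latex
% Assumed notation: \beta > 0 is the smoothness index, d the input dimension,
% m the largest integer strictly smaller than \beta, \alpha a multi-index.
\[
  \|f\|_{\mathcal{H}^{\beta}}
  = \max_{|\alpha| \le m} \, \sup_{x \in [0,1]^d} \bigl|\partial^{\alpha} f(x)\bigr|
  + \max_{|\alpha| = m} \, \sup_{x \neq y}
    \frac{\bigl|\partial^{\alpha} f(x) - \partial^{\alpha} f(y)\bigr|}{\|x - y\|^{\beta - m}}
\]
% f is called \beta-Hölder smooth when this norm is finite.
% Classical minimax rate for estimating such an f from n noisy samples
% under squared L_2 loss (nonparametric regression):
\[
  \inf_{\hat f} \, \sup_{\|f\|_{\mathcal{H}^{\beta}} \le K}
  \mathbb{E}\,\bigl\|\hat f - f\bigr\|_{L_2}^{2}
  \asymp n^{-\frac{2\beta}{2\beta + d}}
\]
```

Minimax optimality of a regression estimator is usually understood as matching this rate, possibly up to logarithmic factors; whether and where such factors appear in the paper's results is not stated in this summary.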

