LG OC, May 22, 2024

Interpolation with deep neural networks with non-polynomial activations: necessary and sufficient numbers of neurons

arXiv:2405.13738v2, 2 citations, h-index: 2
Originality: Incremental advance
AI Analysis

The result gives practitioners a theoretical guarantee that they can choose activation functions freely without compromising interpolation capability. It is incremental in that it extends prior results to a broader class of activation functions.

The paper addresses the minimal number of neurons a feedforward neural network needs to interpolate n generic data points with outputs of dimension d'. It proves that Θ(√(nd')) neurons are sufficient whenever the activation function is real analytic at a point and not a polynomial there, so among practical activation functions only piecewise polynomials are excluded.

The minimal number of neurons required for a feedforward neural network to interpolate $n$ generic input-output pairs from $\mathbb{R}^d\times \mathbb{R}^{d'}$ is $\Theta(\sqrt{nd'})$. While previous results have shown that $\Theta(\sqrt{nd'})$ neurons are sufficient, they have been limited to sigmoid, Heaviside, and rectified linear unit (ReLU) as the activation function. Using a different approach, we prove that $\Theta(\sqrt{nd'})$ neurons are sufficient as long as the activation function is real analytic at a point and not a polynomial there. Thus, the only practical activation functions that our result does not apply to are piecewise polynomials. Importantly, this means that activation functions can be freely chosen in a problem-dependent manner without loss of interpolation power.
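One common intuition for the $\sqrt{nd'}$ scaling is parameter counting: interpolating $n$ pairs with $d'$-dimensional outputs imposes $nd'$ scalar constraints, while two adjacent hidden layers of width $w$ contribute about $w^2$ weights. The Python sketch below only illustrates that count and is not the paper's construction or proof. The helper names `count_parameters` and `heuristic_width` and the sample values of `n`, `d`, and `d_out` are assumptions made for illustration.

```python
import math


def count_parameters(widths):
    """Weights plus biases of a fully connected network with these layer widths."""
    return sum((fan_in + 1) * fan_out for fan_in, fan_out in zip(widths, widths[1:]))


def heuristic_width(n, d_out):
    """Hypothetical helper: smallest integer w with w * w >= n * d_out."""
    w = math.isqrt(n * d_out)
    return w if w * w == n * d_out else w + 1


# Illustrative sizes (assumed, not taken from the paper).
n, d, d_out = 10_000, 8, 4

w = heuristic_width(n, d_out)      # width of each of the two hidden layers
widths = [d, w, w, d_out]          # input, hidden, hidden, output
params = count_parameters(widths)  # total trainable parameters
constraints = n * d_out            # scalar equations imposed by interpolation

print(f"hidden neurons: {2 * w}, parameters: {params}, constraints: {constraints}")
```

With these sizes the two hidden layers hold 400 neurons in total, yet the network has slightly more parameters than the 40,000 interpolation constraints. Having enough parameters is only a necessary condition; the paper's contribution is proving that this order of neurons actually suffices for the stated class of activation functions.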

