FA, LG, ML | Mar 29, 2023

Optimal approximation using complex-valued neural networks

arXiv:2303.16813v2 | 10 citations | h-index: 2
Originality: Incremental advance
AI Analysis

This work gives researchers in machine learning and applied mathematics theoretical insight into complex-valued neural networks (CVNNs), addressing a gap in their mathematical understanding. Its advance is incremental in that it extends real-valued approximation theory to the complex domain.

The paper addresses the lack of a mathematical foundation for complex-valued neural networks (CVNNs) by analyzing their approximation properties. It derives the first quantitative approximation bounds for a wide class of activation functions and shows that the error for approximating $C^k$-functions scales as $m^{-k/(2n)}$, where $m$ is the number of neurons, $k$ is the smoothness of the target function, and $n$ is the (complex) input dimension. Under a natural continuity assumption, this rate is optimal.
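To make the scaling concrete, the following minimal sketch evaluates the rate $m^{-k/(2n)}$ and inverts it to estimate how many neurons a target error would require. It assumes the unknown constant in the bound is 1, which the paper does not claim; the function names `error_rate` and `neurons_needed` are hypothetical and chosen only for illustration.

```python
import math


def error_rate(m: int, k: int, n: int) -> float:
    """Evaluate m^{-k/(2n)}; the constant factor is set to 1 (an assumption)."""
    return m ** (-k / (2 * n))


def neurons_needed(eps: float, k: int, n: int) -> int:
    """Estimate the smallest m with m^{-k/(2n)} <= eps, i.e. m >= eps^{-2n/k}.

    The result is exact only up to floating-point rounding.
    """
    return math.ceil(eps ** (-2 * n / k))


if __name__ == "__main__":
    # Fixed smoothness k = 2 and target error 0.1; only the dimension n varies.
    for n in (1, 2, 4, 8):
        print(f"n = {n}: about {neurons_needed(0.1, k=2, n=n)} neurons")
```

For fixed $k$ and target error, the required $m$ grows exponentially in $n$, which is the curse of dimensionality in this setting. Doubling $k$ halves the exponent $2n/k$, so smoother targets need far fewer neurons.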

Complex-valued neural networks (CVNNs) have recently shown promising empirical success, for instance for increasing the stability of recurrent neural networks and for improving the performance in tasks with complex-valued inputs, such as in MRI fingerprinting. While the overwhelming success of Deep Learning in the real-valued case is supported by a growing mathematical foundation, such a foundation is still largely lacking in the complex-valued case. We thus analyze the expressivity of CVNNs by studying their approximation properties. Our results yield the first quantitative approximation bounds for CVNNs that apply to a wide class of activation functions including the popular modReLU and complex cardioid activation functions. Precisely, our results apply to any activation function that is smooth but not polyharmonic on some non-empty open set; this is the natural generalization of the class of smooth and non-polynomial activation functions to the complex setting. Our main result shows that the error for the approximation of $C^k$-functions scales as $m^{-k/(2n)}$ for $m \to \infty$ where $m$ is the number of neurons, $k$ the smoothness of the target function and $n$ is the (complex) input dimension. Under a natural continuity assumption, we show that this rate is optimal; we further discuss the optimality when dropping this assumption. Moreover, we prove that the problem of approximating $C^k$-functions using continuous approximation methods unavoidably suffers from the curse of dimensionality.
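As an illustration of the two activation functions named above, here is a minimal sketch using only Python's built-in complex type. It follows the commonly stated definitions, modReLU as $\mathrm{ReLU}(|z| + b)\, z/|z|$ and the complex cardioid as $\tfrac{1}{2}(1 + \cos(\arg z))\, z$; the paper's exact parametrization may differ, and the default bias `b = -0.5` is an arbitrary value chosen for the example.

```python
import cmath
import math


def mod_relu(z: complex, b: float = -0.5) -> complex:
    """modReLU: shrink the modulus by the bias b and keep the phase.

    Returns 0 when |z| + b <= 0. The default b is illustrative only.
    """
    r = abs(z)
    if r == 0 or r + b <= 0:
        return 0j
    return (r + b) * z / r


def cardioid(z: complex) -> complex:
    """Complex cardioid: scale z by (1 + cos(arg z)) / 2, keeping the phase."""
    return 0.5 * (1 + math.cos(cmath.phase(z))) * z


if __name__ == "__main__":
    for z in (3 + 4j, 0.3 + 0.4j, -2 + 0j, 1j):
        print(z, mod_relu(z, b=-1.0), cardioid(z))
```

Both functions leave the phase of a nonzero output unchanged and act only on the modulus. On the real axis the cardioid reduces to the ordinary ReLU: positive reals pass through and negative reals map to zero.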
