LGNENAPRMLApr 24, 2023

An Approximation Theory for Metric Space-Valued Functions With A View Towards Deep Learning

arXiv:2304.12231v26 citationsh-index: 47
Originality Highly original
AI Analysis

This work provides a foundational mathematical framework for deep learning, enabling approximation of functions in broader metric spaces, which is significant for applications in geometric deep learning and inverse problems.

The paper tackles the problem of approximating continuous functions between arbitrary Polish metric spaces, overcoming the limitation that previous methods required the target space to be a topological vector space, by using approximators that output discrete probability measures and showing that the number of required Dirac measures depends on the combinatorial structure of the spaces.

Motivated by the developing mathematics of deep learning, we build universal functions approximators of continuous maps between arbitrary Polish metric spaces $\mathcal{X}$ and $\mathcal{Y}$ using elementary functions between Euclidean spaces as building blocks. Earlier results assume that the target space $\mathcal{Y}$ is a topological vector space. We overcome this limitation by ``randomization'': our approximators output discrete probability measures over $\mathcal{Y}$. When $\mathcal{X}$ and $\mathcal{Y}$ are Polish without additional structure, we prove very general qualitative guarantees; when they have suitable combinatorial structure, we prove quantitative guarantees for Hölder-like maps, including maps between finite graphs, solution operators to rough differential equations between certain Carnot groups, and continuous non-linear operators between Banach spaces arising in inverse problems. In particular, we show that the required number of Dirac measures is determined by the combinatorial structure of $\mathcal{X}$ and $\mathcal{Y}$. For barycentric $\mathcal{Y}$, including Banach spaces, $\mathbb{R}$-trees, Hadamard manifolds, or Wasserstein spaces on Polish metric spaces, our approximators reduce to $\mathcal{Y}$-valued functions. When the Euclidean approximators are neural networks, our constructions generalize transformer networks, providing a new probabilistic viewpoint of geometric deep learning.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes