Neural networks in non-metric spaces
This work addresses the challenge of applying neural networks to non-metric and infinite-dimensional spaces. It is incremental in that it builds on the authors' prior research, expanding both its theoretical scope and its practical applicability.
The authors extend their infinite-dimensional neural network architecture to a broad class of input and output spaces, including quasi-Polish spaces and topological vector spaces. They prove universal approximation theorems and ensure numerical feasibility through finite-dimensional projections. They also prove an obstruction result indicating that quasi-Polish spaces are, in a certain sense, the correct category for such architectures.
Leveraging the infinite-dimensional neural network architecture that we proposed in arXiv:2109.13512v4, which can process inputs from Fréchet spaces, and using the universal approximation property shown therein, we substantially extend the scope of this architecture by proving several universal approximation theorems for a vast class of input and output spaces. More precisely, the input space $\mathfrak X$ is allowed to be a general topological space satisfying only a mild condition ("quasi-Polish"), and the output space can be either another quasi-Polish space $\mathfrak Y$ or a topological vector space $E$. As in arXiv:2109.13512v4, we furthermore show that our neural network architectures can be projected down to "finite-dimensional" subspaces with any desired accuracy, thus obtaining approximating networks that are easy to implement and allow for fast computation and fitting. The resulting neural network architecture is therefore applicable to prediction tasks based on functional data. To the best of our knowledge, this is the first result that deals with such a wide class of input/output spaces and simultaneously guarantees the numerical feasibility of the ensuing architectures. Finally, we prove an obstruction result which indicates that the category of quasi-Polish spaces is, in a certain sense, the correct category to work with if one aims at constructing approximating architectures on infinite-dimensional spaces $\mathfrak X$ that simultaneously have sufficient expressive power to approximate continuous functions on $\mathfrak X$, are specified by only a finite number of parameters, and are "stable" with respect to these parameters.
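To make the idea of projecting onto a "finite-dimensional" subspace concrete for functional data, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the construction from the paper: the input space is assumed to be $L^2[0,1]$, the cosine basis, the trapezoidal quadrature, the one-hidden-layer network on the coefficients, and all function names (`cosine_basis`, `project`, `network`, `init_params`) are hypothetical choices made here.

```python
import numpy as np


def cosine_basis(n_basis, grid):
    """Rows are e_0 = 1 and e_k = sqrt(2) cos(k pi t), orthonormal in L^2[0, 1]."""
    k = np.arange(n_basis)[:, None]
    basis = np.sqrt(2.0) * np.cos(k * np.pi * grid[None, :])
    basis[0, :] = 1.0
    return basis


def project(curves, grid, n_basis):
    """Approximate the first n_basis L^2 coefficients of each sampled curve
    using the trapezoidal rule on a uniform grid (an assumed discretization)."""
    dt = grid[1] - grid[0]
    weights = np.full(grid.shape, dt)
    weights[[0, -1]] = dt / 2.0
    return (curves * weights) @ cosine_basis(n_basis, grid).T


def init_params(n_basis, width, rng):
    """Random initial parameters for a one-hidden-layer network (hypothetical)."""
    return (
        rng.normal(size=(n_basis, width)) / np.sqrt(n_basis),
        np.zeros(width),
        rng.normal(size=width) / np.sqrt(width),
        0.0,
    )


def network(coeffs, params):
    """Scalar-valued network acting only on the finite coefficient vector."""
    W1, b1, w2, b2 = params
    return np.tanh(coeffs @ W1 + b1) @ w2 + b2


grid = np.linspace(0.0, 1.0, 1001)
rng = np.random.default_rng(0)

# Two sample "functional" inputs: the basis element e_1 and t -> t^2.
curves = np.stack([np.sqrt(2.0) * np.cos(np.pi * grid), grid ** 2])

coeffs = project(curves, grid, n_basis=8)          # shape (2, 8)
outputs = network(coeffs, init_params(8, 16, rng))  # shape (2,)
```

In this sketch the network has finitely many parameters and only ever sees the first `n_basis` coefficients of each curve; increasing `n_basis` refines the projection, which is the role the finite-dimensional subspaces play in the abstract.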