Geometric separation and constructive universal approximation with two hidden layers
The paper provides a constructive theoretical foundation for neural network approximation, a core problem in machine learning theory, although the contribution is incremental in that it builds on existing universal approximation theorems.
The paper tackles the problem of constructing neural networks that separate disjoint compact subsets of ℝⁿ and uses this separation to achieve universal approximation. It shows that networks with two hidden layers and sigmoidal or ReLU activations can approximate any continuous function on a compact set to any prescribed accuracy in the uniform norm. For finite sets, the construction simplifies to a depth-2 (single hidden layer) result.
We give a geometric construction of neural networks that separate disjoint compact subsets of $\Bbb R^n$, and use it to obtain a constructive universal approximation theorem. Specifically, we show that networks with two hidden layers and either a sigmoidal activation (i.e., strictly monotone, bounded, and continuous) or the ReLU activation can approximate any real-valued continuous function on an arbitrary compact set $K\subset\Bbb R^n$ to any prescribed accuracy in the uniform norm. For finite $K$, the construction simplifies and yields a sharp depth-2 (single hidden layer) approximation result.
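To make the finite-$K$ case concrete, the following is a minimal sketch of one standard way to fit a function exactly on finitely many points with a single hidden ReLU layer. It is not the paper's geometric construction, and no claim is made that it attains the sharp bound mentioned above. It assumes a random direction `w` gives the points distinct projections (true with probability 1), then interpolates the values piecewise linearly along that direction. The names `fit_relu_interpolant` and `evaluate` are hypothetical, introduced only for this illustration.

```python
import numpy as np

def fit_relu_interpolant(points, values, seed=0):
    """Fit f(x) = bias + sum_j coeffs[j] * relu(w.x - knots[j]) so that
    f(points[i]) == values[i]. Illustrative sketch, not the paper's method."""
    points = np.asarray(points, dtype=float)
    values = np.asarray(values, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: a random direction separates the projections of distinct points.
    w = rng.standard_normal(points.shape[1])
    t = points @ w
    order = np.argsort(t)
    t, y = t[order], values[order]
    if np.any(np.diff(t) == 0.0):
        raise ValueError("projections coincide; try another seed")
    # Slope of the piecewise-linear interpolant on each interval [t_j, t_{j+1}].
    slopes = np.diff(y) / np.diff(t)
    # Each hidden unit contributes the change in slope at its knot.
    coeffs = np.diff(slopes, prepend=0.0)
    knots = t[:-1]
    bias = y[0]
    return w, knots, coeffs, bias

def evaluate(params, x):
    """Evaluate the one-hidden-layer ReLU network on rows of x."""
    w, knots, coeffs, bias = params
    t = np.atleast_2d(np.asarray(x, dtype=float)) @ w
    hidden = np.maximum(t[:, None] - knots, 0.0)  # ReLU hidden layer
    return bias + hidden @ coeffs

# Usage: five points in R^3 with arbitrary target values.
K = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 2.0], [0.5, 1.0, -1.0],
              [2.0, -1.0, 0.3], [-1.0, 0.7, 0.9]])
targets = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
params = fit_relu_interpolant(K, targets)
max_error = np.max(np.abs(evaluate(params, K) - targets))
```

In this sketch, a set of $m$ points uses $m-1$ hidden units, and `max_error` should be zero up to floating-point rounding. The uniform-norm statement for general compact $K$ requires the two-hidden-layer construction described in the abstract and is not covered by this example.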