Universal Approximation of Markov Kernels by Shallow Stochastic Feedforward Networks
The result provides a theoretical guarantee for the expressiveness of shallow stochastic networks in probabilistic modeling. The contribution is incremental in that it builds on existing universal approximation results.
The paper addresses the approximation of Markov kernels by shallow stochastic feedforward networks. It shows that a binary network with a single hidden layer and sigmoid activation probabilities is a universal approximator of Markov kernels, and that $2^{k-1}(2^{n-1}-1)$ hidden units suffice for $k$ input units and $n$ output units.
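To make the size of the bound concrete, the following minimal sketch evaluates the formula for a few small networks. The function names `hidden_unit_bound` and `kernel_free_parameters` are illustrative and do not come from the paper. The second function is included only for comparison: it counts the free parameters of a general Markov kernel from $k$ binary inputs to $n$ binary outputs, which has $2^k$ rows with $2^n - 1$ free probabilities each.

```python
def hidden_unit_bound(k: int, n: int) -> int:
    """Evaluate the stated upper bound 2^(k-1) * (2^(n-1) - 1) on hidden units."""
    if k < 1 or n < 1:
        raise ValueError("need k >= 1 input units and n >= 1 output units")
    return 2 ** (k - 1) * (2 ** (n - 1) - 1)


def kernel_free_parameters(k: int, n: int) -> int:
    """Free parameters of a Markov kernel from {0,1}^k to {0,1}^n (comparison only)."""
    return 2 ** k * (2 ** n - 1)


if __name__ == "__main__":
    # Print the bound next to the kernel dimension for a few (k, n) pairs.
    for k, n in [(1, 2), (2, 2), (3, 3), (4, 4)]:
        print(f"k={k} n={n} hidden<={hidden_unit_bound(k, n)} "
              f"kernel_params={kernel_free_parameters(k, n)}")
```

For example, $k = n = 3$ gives a bound of 12 hidden units, against 56 free parameters in the target kernel.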
We establish upper bounds for the minimal number of hidden units for which a binary stochastic feedforward network with sigmoid activation probabilities and a single hidden layer is a universal approximator of Markov kernels. We show that each possible probabilistic assignment of the states of $n$ output units, given the states of $k\geq1$ input units, can be approximated arbitrarily well by a network with $2^{k-1}(2^{n-1}-1)$ hidden units.
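The object being approximated can be illustrated with a small sketch that computes the Markov kernel realized by a network of this type. It assumes the usual model for such networks, which the abstract does not spell out: given the previous layer, the units of a layer are conditionally independent, and each turns on with probability given by a sigmoid of its weighted input. All names (`layer_prob`, `kernel`, `W1`, `b1`, `W2`, `b2`) are hypothetical, and the random weights are placeholders. The sketch does not construct the approximating parameters from the paper; it only shows how the kernel arises by summing over hidden states.

```python
import itertools
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def layer_prob(state, inputs, weights, biases):
    """Probability that a layer of independent binary units takes `state`,
    given the binary vector `inputs` of the previous layer (assumed model)."""
    p = 1.0
    for s, w_row, b in zip(state, weights, biases):
        on = sigmoid(sum(w * x for w, x in zip(w_row, inputs)) + b)
        p *= on if s == 1 else 1.0 - on
    return p


def kernel(W1, b1, W2, b2, k, n):
    """Return P(y | x) for all binary x, y by marginalizing the hidden layer."""
    m = len(b1)  # number of hidden units
    table = {}
    for x in itertools.product((0, 1), repeat=k):
        table[x] = {
            y: sum(
                layer_prob(h, x, W1, b1) * layer_prob(y, h, W2, b2)
                for h in itertools.product((0, 1), repeat=m)
            )
            for y in itertools.product((0, 1), repeat=n)
        }
    return table


# Example with k = 2 inputs, n = 2 outputs and m = 2 hidden units,
# which matches 2^(k-1) * (2^(n-1) - 1) = 2 for this case.
rng = random.Random(0)
k, n, m = 2, 2, 2
W1 = [[rng.gauss(0, 1) for _ in range(k)] for _ in range(m)]
b1 = [rng.gauss(0, 1) for _ in range(m)]
W2 = [[rng.gauss(0, 1) for _ in range(m)] for _ in range(n)]
b2 = [rng.gauss(0, 1) for _ in range(n)]
K = kernel(W1, b1, W2, b2, k, n)
```

Each entry `K[x]` is a probability distribution over the $2^n$ output states, so the table has $2^k$ rows that each sum to one. Universal approximation means that, for a suitable choice of weights and biases, such a table can be brought arbitrarily close to any prescribed kernel.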