On the capacity of deep generative networks for approximating distributions
This work addresses the theoretical foundations of generative modeling in machine learning, providing insight into the capacity and limitations of deep networks for distribution approximation. The contribution is incremental, but it clarifies how the choice of metric determines what can be approximated.
The paper tackles the problem of approximating probability distributions with deep generative networks. It proves that these networks can transform a low-dimensional source distribution so that it closely matches a high-dimensional target distribution under Wasserstein distances and maximum mean discrepancy, with error bounds that depend on the network architecture (width and depth) and on the intrinsic dimension of the target. It also shows a limitation: under $f$-divergences, the source dimension cannot be smaller than the intrinsic dimension of the target.
We study the efficacy and efficiency of deep generative networks for approximating probability distributions. We prove that neural networks can transform a low-dimensional source distribution to a distribution that is arbitrarily close to a high-dimensional target distribution, when closeness is measured by Wasserstein distances or maximum mean discrepancy. Upper bounds on the approximation error are obtained in terms of the width and depth of the neural network. Furthermore, it is shown that the approximation error in Wasserstein distance grows at most linearly in the ambient dimension and that the approximation order depends only on the intrinsic dimension of the target distribution. In contrast, when $f$-divergences are used as metrics of distributions, the approximation property is different. We show that in order to approximate the target distribution in $f$-divergences, the dimension of the source distribution cannot be smaller than the intrinsic dimension of the target distribution.
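The contrast between the two kinds of metric can be illustrated with a toy example. The sketch below is not the paper's construction and is not a neural network: it uses a hypothetical piecewise map `zigzag` that sends a one-dimensional uniform source on [0, 1) onto `k` horizontal segments in the unit square, and a hypothetical helper `coupling_cost` that evaluates, on a grid, the cost of one explicit coupling between the uniform distribution on the square and the pushed-forward distribution. The cost of any valid coupling is an upper bound on the Wasserstein-1 distance, so the sketch only suggests why a low-dimensional source can get close to a higher-dimensional target in that metric.

```python
import math


def zigzag(z, k):
    """Map z in [0, 1) to a point on one of k horizontal segments in the unit square.

    Hypothetical toy map for illustration; it is discontinuous in y.
    """
    i = min(int(z * k), k - 1)  # index of the segment that z falls on
    x = z * k - i               # position along that segment, in [0, 1)
    y = (i + 0.5) / k           # height of the segment
    return x, y


def coupling_cost(k, m=200):
    """Grid estimate of the cost of one explicit coupling.

    Each point (x, y) of the square is paired with the point of the curve
    directly above or below it on the segment of its own horizontal band.
    The average distance moved upper-bounds the Wasserstein-1 distance
    between the uniform distribution on the square and the pushforward
    of the uniform distribution on [0, 1) under zigzag.
    """
    total = 0.0
    for a in range(m):
        for b in range(m):
            x, y = (a + 0.5) / m, (b + 0.5) / m  # midpoint grid on the square
            i = min(int(y * k), k - 1)           # band containing (x, y)
            z = (i + x) / k                      # source point mapped to that band
            px, py = zigzag(z, k)
            total += math.hypot(x - px, y - py)
    return total / (m * m)


if __name__ == "__main__":
    for k in (2, 4, 8, 16):
        print(k, coupling_cost(k))
```

Under these assumptions the cost is approximately 1/(4k), so it shrinks as the number of segments grows. The pushed-forward distribution nevertheless stays concentrated on a one-dimensional set for every `k`, which is consistent with the abstract's statement that approximation in $f$-divergences requires a source dimension at least as large as the intrinsic dimension of the target.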