ML, LG · Jun 22, 2018

Tensor Monte Carlo: particle methods for the GPU era

arXiv:1806.08593v3 · 14 citations

Originality: Incremental advance

AI Analysis

This work addresses a computational bottleneck in variational inference that affects both researchers and practitioners, though it is an incremental improvement over existing methods.

The paper tackles the poor scaling of importance-weighted variational autoencoders (IWAE) in high-dimensional latent spaces by proposing tensor Monte-Carlo (TMC), which averages over exponentially many combinations of importance samples using tensor operations. The results show that TMC outperforms IWAE on a multi-layer generative model trained on MNIST, and that it can be combined with standard variance reduction techniques.

Multi-sample, importance-weighted variational autoencoders (IWAE) give tighter bounds and more accurate uncertainty estimates than variational autoencoders (VAE) trained with a standard single-sample objective. However, IWAEs scale poorly: as the latent dimensionality grows, they require exponentially many samples to retain the benefits of importance weighting. While sequential Monte-Carlo (SMC) can address this problem, it is prohibitively slow because the resampling step imposes sequential structure which cannot be parallelised, and moreover, resampling is non-differentiable which is problematic when learning approximate posteriors. To address these issues, we developed tensor Monte-Carlo (TMC) which gives exponentially many importance samples by separately drawing $K$ samples for each of the $n$ latent variables, then averaging over all $K^n$ possible combinations. While the sum over exponentially many terms might seem to be intractable, in many cases it can be computed efficiently as a series of tensor inner-products. We show that TMC is superior to IWAE on a generative model with multiple stochastic layers trained on the MNIST handwritten digit database, and we show that TMC can be combined with standard variance reduction techniques.
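The averaging step described in the abstract can be illustrated with a minimal NumPy sketch. It assumes a toy chain-structured Gaussian model (z_1 → z_2 → ... → z_n → x) with a factorised Gaussian proposal. The model, the proposal width `q_std`, and all function names are hypothetical choices made for illustration and are not taken from the paper. For a chain, the sum over all K^n sample combinations reduces to a sequence of vector-matrix products, which the sketch checks against explicit enumeration.

```python
import itertools
import numpy as np


def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian, broadcast over array inputs."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))


def tmc_factors(x, n, K, rng, q_std=2.0):
    """Draw K samples per latent and build importance-ratio factors.

    Assumed toy model (not from the paper): z_1 ~ N(0, 1),
    z_{i+1} | z_i ~ N(z_i, 1), x | z_n ~ N(z_n, 1),
    with independent proposals q(z_i) = N(0, q_std^2).
    """
    z = rng.normal(0.0, q_std, size=(n, K))    # z[i, k]: k-th sample of latent i
    q = normal_pdf(z, 0.0, q_std)              # proposal densities, shape (n, K)
    first = normal_pdf(z[0], 0.0, 1.0) / q[0]  # p(z_1^k) / q(z_1^k), shape (K,)
    # pairs[i][a, b] = p(z_{i+1}^b | z_i^a) / q(z_{i+1}^b), shape (K, K)
    pairs = [
        normal_pdf(z[i + 1][None, :], z[i][:, None], 1.0) / q[i + 1][None, :]
        for i in range(n - 1)
    ]
    last = normal_pdf(x, z[-1], 1.0)           # p(x | z_n^k), shape (K,)
    return first, pairs, last


def tmc_estimate(first, pairs, last):
    """Average over all K^n combinations using n inner products."""
    K = first.shape[0]
    msg = first / K                 # average over samples of z_1
    for pair in pairs:
        msg = msg @ pair / K        # sum out the previous latent, then average
    return msg @ last               # sum out z_n against the likelihood


def brute_force_estimate(first, pairs, last):
    """Same average, by enumerating every combination (cost K^n)."""
    K = first.shape[0]
    n = len(pairs) + 1
    total = 0.0
    for ks in itertools.product(range(K), repeat=n):
        w = first[ks[0]] * last[ks[-1]]
        for i, pair in enumerate(pairs):
            w *= pair[ks[i], ks[i + 1]]
        total += w
    return total / K ** n


rng = np.random.default_rng(0)
first, pairs, last = tmc_factors(x=0.5, n=4, K=5, rng=rng)
print(tmc_estimate(first, pairs, last))
print(brute_force_estimate(first, pairs, last))
```

The two printed values should agree up to floating-point error, while the contraction costs on the order of n·K² operations rather than K^n. The sketch works directly with densities for readability; for larger n the products would underflow, so a practical version would presumably work with log-weights.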
