On Transformations in Stochastic Gradient MCMC
This work addresses a practical limitation in Bayesian inference for models with bounded variables, though the contribution is incremental because it builds on existing SGLD methods.
The paper tackles erroneous sampling in stochastic gradient MCMC for bounded variables. It shows that a change of random variable using an invertible Lipschitz mapping function corrects the issue and attains weak convergence, and its experiments validate the approach on models such as Bayesian non-negative matrix factorization and binary neural networks.
Stochastic gradient Langevin dynamics (SGLD) is a computationally efficient sampler for Bayesian posterior inference on large-scale datasets. Although SGLD is designed for unbounded random variables, many practical models incorporate bounded variables, such as non-negative ones or those restricted to a finite interval. To bridge this gap, we consider mapping unbounded samples into the target interval. This paper reveals, from both theoretical and empirical perspectives, that several mapping approaches commonly used in the literature produce erroneous samples. We show that a change of random variable using an invertible Lipschitz mapping function overcomes this pitfall and attains weak convergence. Experiments demonstrate its efficacy for widely used models with bounded latent variables, including Bayesian non-negative matrix factorization and binary neural networks.
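The change-of-variable idea can be illustrated with a minimal sketch. It is not the paper's implementation, and several choices are assumptions made for illustration only: the mapping is softplus (invertible from the real line to the positive half-line, with derivative bounded by 1), the target is a Gamma(shape, rate) density on a non-negative variable, and the exact gradient stands in for a minibatch estimate. The function names are hypothetical. The sampler runs Langevin updates on the unbounded variable phi, adds the log-Jacobian term to the gradient, and maps each iterate back to theta = softplus(phi).

```python
import numpy as np

def softplus(phi):
    # Numerically stable log(1 + exp(phi)); maps the real line to (0, inf).
    return np.logaddexp(0.0, phi)

def sigmoid(phi):
    # Derivative of softplus.
    return 1.0 / (1.0 + np.exp(-phi))

def gamma_grad_log_density(theta, shape, rate):
    # d/dtheta of log Gamma(theta; shape, rate), up to the normalising constant.
    return (shape - 1.0) / theta - rate

def langevin_with_softplus(shape, rate, n_steps, step_size, seed=0):
    """Langevin sampling in the unbounded space phi, with theta = softplus(phi).

    Assumption: the exact gradient is used here; in SGLD proper it would be
    replaced by a minibatch estimate of the log-posterior gradient.
    """
    rng = np.random.default_rng(seed)
    phi = 0.0
    samples = np.empty(n_steps)
    for t in range(n_steps):
        theta = softplus(phi)
        s = sigmoid(phi)
        # Density of phi: p(softplus(phi)) * softplus'(phi).
        # Chain rule term plus d/dphi log sigmoid(phi) = 1 - sigmoid(phi).
        grad_phi = gamma_grad_log_density(theta, shape, rate) * s + (1.0 - s)
        noise = rng.standard_normal()
        phi = phi + 0.5 * step_size * grad_phi + np.sqrt(step_size) * noise
        samples[t] = softplus(phi)
    return samples
```

Calling `langevin_with_softplus(3.0, 2.0, 50_000, 0.05)` and discarding an initial burn-in should give strictly positive samples whose mean is close to the Gamma mean shape/rate, up to discretisation bias and Monte Carlo error. Omitting the `(1.0 - s)` Jacobian term would make the chain target a different density in theta.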