Partition Functions from Rao-Blackwellized Tempered Sampling
This work addresses a key challenge in model evaluation and comparison for machine learning practitioners, offering an incremental improvement in efficiency and accuracy for specific applications like RBMs.
The paper tackles the problem of computing partition functions for complex multimodal distributions, presenting a new method that uses Rao-Blackwellized tempered sampling to provide more accurate estimates than Annealed Importance Sampling for large Restricted Boltzmann Machines, with minimal computational cost.
Partition functions of probability distributions are important quantities for model evaluation and comparisons. We present a new method to compute partition functions of complex and multimodal distributions. Such distributions are often sampled using simulated tempering, which augments the target space with an auxiliary inverse temperature variable. Our method exploits the multinomial probability law of the inverse temperatures, and provides estimates of the partition function in terms of a simple quotient of Rao-Blackwellized marginal inverse temperature probability estimates, which are updated while sampling. We show that the method has interesting connections with several alternative popular methods, and offers some significant advantages. In particular, we empirically find that the new method provides more accurate estimates than Annealed Importance Sampling when calculating partition functions of large Restricted Boltzmann Machines (RBM); moreover, the method is sufficiently accurate to track training and validation log-likelihoods during learning of RBMs, at minimal computational cost.