Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC
This work addresses slow inference in Bayesian Matrix Factorization, offering practitioners a faster method with comparable accuracy, though the contribution is incremental because it builds on existing stochastic gradient and MCMC techniques.
The paper tackles the high computational cost of Bayesian Matrix Factorization by proposing a scalable distributed algorithm based on stochastic gradient MCMC. The algorithm reaches the prediction accuracy of Gibbs sampling an order of magnitude faster, and the authors report RMSE improvements of 4.1% on the Netflix dataset and 1.8% on the Yahoo music dataset.
Despite having various attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid over-fitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian matrix factorization algorithm using stochastic gradient MCMC. Our algorithm, based on Distributed Stochastic Gradient Langevin Dynamics, not only matches the prediction accuracy of standard MCMC methods like Gibbs sampling, but is also as fast and simple as stochastic gradient descent. In our experiments, we show that our algorithm achieves the same level of prediction accuracy as Gibbs sampling an order of magnitude faster. We also show that our method reduces the prediction error as fast as distributed stochastic gradient descent, achieving a 4.1% improvement in RMSE on the Netflix dataset and a 1.8% improvement on the Yahoo music dataset.
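To make the core idea concrete, the following is a minimal single-machine sketch of a stochastic gradient Langevin dynamics (SGLD) update applied to matrix factorization. It is not the paper's distributed algorithm. It assumes a Gaussian likelihood with fixed precision `tau` and zero-mean Gaussian priors with fixed precision `lam` on the factors, and every name in it (`sgld_matrix_factorization`, `make_synthetic_ratings`, `rmse`, and all parameters and default values) is hypothetical, chosen only for illustration. Each step moves the factors by half the step size times a mini-batch estimate of the log-posterior gradient, then adds Gaussian noise whose variance equals the step size. Predictions are averaged over the samples drawn after burn-in.

```python
import math
import random


def make_synthetic_ratings(n_users=20, n_items=15, rank=2, noise=0.1, seed=1):
    """Fully observed low-rank ratings with small Gaussian noise (toy data)."""
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n_items)]
    return [
        (i, j, sum(a * b for a, b in zip(U[i], V[j])) + rng.gauss(0.0, noise))
        for i in range(n_users)
        for j in range(n_items)
    ]


def rmse(preds, ratings):
    """Root mean squared error between predictions and observed ratings."""
    total = sum((p - r) ** 2 for p, (_, _, r) in zip(preds, ratings))
    return math.sqrt(total / len(ratings))


def sgld_matrix_factorization(ratings, n_users, n_items, rank=2, steps=2000,
                              burn_in=500, batch_size=50, step_size=1e-3,
                              tau=4.0, lam=1.0, seed=0):
    """Return posterior-averaged predictions for every (user, item, rating)."""
    rng = random.Random(seed)
    U = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    scale = len(ratings) / batch_size      # rescales the mini-batch gradient
    noise_std = math.sqrt(step_size)       # injected noise has variance = step size
    pred_sum = [0.0] * len(ratings)
    n_samples = 0

    for t in range(steps):
        # Gradient of the log prior: -lam * theta (assumed Gaussian prior).
        grad_U = [[-lam * x for x in row] for row in U]
        grad_V = [[-lam * x for x in row] for row in V]

        # Rescaled mini-batch gradient of the Gaussian log likelihood.
        for i, j, r in rng.sample(ratings, batch_size):
            err = r - sum(a * b for a, b in zip(U[i], V[j]))
            for k in range(rank):
                grad_U[i][k] += scale * tau * err * V[j][k]
                grad_V[j][k] += scale * tau * err * U[i][k]

        # Langevin step: half-step along the gradient plus Gaussian noise.
        for factors, grads in ((U, grad_U), (V, grad_V)):
            for row, grad_row in zip(factors, grads):
                for k in range(rank):
                    row[k] += 0.5 * step_size * grad_row[k] + rng.gauss(0.0, noise_std)

        # After burn-in, treat each iterate as a posterior sample.
        if t >= burn_in:
            n_samples += 1
            for idx, (i, j, _) in enumerate(ratings):
                pred_sum[idx] += sum(a * b for a, b in zip(U[i], V[j]))

    return [s / n_samples for s in pred_sum]


if __name__ == "__main__":
    data = make_synthetic_ratings()
    preds = sgld_matrix_factorization(data, n_users=20, n_items=15)
    print("SGLD RMSE (training data):", rmse(preds, data))
    print("All-zeros baseline RMSE:  ", rmse([0.0] * len(data), data))
```

The sketch scores predictions on the same toy ratings it was fit to, so it is only a sanity check of the update rule, not an evaluation in the sense of the Netflix or Yahoo experiments. A constant step size is used for brevity; a decreasing step-size schedule and a distributed layout of the factor blocks would be needed to move toward the setting the abstract describes.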