LG ML Feb 27, 2018

ADMM-based Networked Stochastic Variational Inference

arXiv:1802.10168v1

Originality: Incremental advance

AI Analysis

This work addresses the need for scalable and secure Bayesian inference in distributed settings, such as deep learning over networks, but it is incremental as it extends existing distributed SVI methods to a networked framework.

The paper tackles the problem of decentralizing stochastic variational inference (SVI) to enable parallel computation and secure learning. It develops an ADMM-based networked SVI algorithm in which independent learners share model parameters instead of private data. The algorithm is evaluated on a topic-modeling task with Wikipedia articles, where it shows results competitive with a centralized algorithm.

Owing to recent advances in "Big Data" modeling and prediction tasks, variational Bayesian estimation has gained popularity due to its ability to provide exact solutions to approximate posteriors. One key technique for approximate inference is stochastic variational inference (SVI). SVI poses variational inference as a stochastic optimization problem and solves it iteratively using noisy gradient estimates. It aims to handle massive data for predictive and classification tasks by applying complex Bayesian models that have observed as well as latent variables. This paper aims to decentralize SVI, enabling parallel computation, secure learning, and robustness. We use the Alternating Direction Method of Multipliers (ADMM) in a top-down setting to develop a distributed SVI algorithm in which independent learners running inference algorithms only need to share their estimated model parameters instead of their private datasets. Our work extends the distributed SVI-ADMM algorithm that we first propose to an ADMM-based networked SVI algorithm, in which the learners not only work distributively but also share information according to the rules of a graph by which they form a network. This kind of work falls under the umbrella of "deep learning over networks", and we verify our algorithm on a topic-modeling problem for a corpus of Wikipedia articles. We illustrate the results on the latent Dirichlet allocation (LDA) topic model in large document classification, compare performance with the centralized algorithm, and use numerical experiments to corroborate the analytical results.
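The parameter-sharing idea in the abstract can be illustrated with a generic consensus ADMM loop. The sketch below is not the paper's algorithm. It assumes a toy one-parameter least-squares model in place of a Bayesian model, plain minibatch gradient steps in place of SVI's noisy natural-gradient updates, and a simple average over all learners in place of graph-based neighbor exchange. All names (`make_private_dataset`, `consensus_admm_sgd`, `rho`, `step`) are hypothetical and chosen for illustration only.

```python
import random

def make_private_dataset(w_true, n_samples, noise, rng):
    """Generate one learner's private (a, y) pairs with y = w_true * a + noise."""
    data = []
    for _ in range(n_samples):
        a = rng.gauss(0.0, 1.0)
        data.append((a, w_true * a + rng.gauss(0.0, noise)))
    return data

def consensus_admm_sgd(datasets, rho=1.0, step=0.05, outer_iters=100,
                       inner_steps=10, batch_size=20, seed=0):
    """Scaled-form consensus ADMM with inexact, minibatch-gradient local updates."""
    rng = random.Random(seed)
    n = len(datasets)
    x = [0.0] * n   # local parameter estimates, one per learner
    u = [0.0] * n   # scaled dual variables, one per learner
    z = 0.0         # shared consensus estimate
    for _ in range(outer_iters):
        # Local step: each learner touches only its own private data.
        for i, data in enumerate(datasets):
            for _ in range(inner_steps):
                batch = rng.sample(data, batch_size)
                # Noisy gradient of the local squared-error loss.
                grad_loss = sum(a * (a * x[i] - y) for a, y in batch) / batch_size
                # Gradient of the ADMM penalty (rho / 2) * (x - z + u)^2.
                grad_penalty = rho * (x[i] - z + u[i])
                x[i] -= step * (grad_loss + grad_penalty)
        # Consensus step: only parameters and duals are exchanged, never data.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each learner's disagreement with the consensus.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return z, x

data_rng = random.Random(1)
w_true = 2.0  # assumed ground-truth parameter for the toy problem
datasets = [make_private_dataset(w_true, 200, 0.01, data_rng) for _ in range(4)]
z, x_local = consensus_admm_sgd(datasets)
print("consensus estimate:", z)
print("local estimates:", x_local)
```

In this toy setting the learners should end up agreeing on a value near `w_true` even though no learner ever sees another learner's samples. A networked variant would replace the global average in the consensus step with exchanges restricted to each learner's graph neighbors.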
