ML LG PR COOct 20, 2017

On the Consistency of Graph-based Bayesian Learning and the Scalability of Sampling Algorithms

Nicolas Garcia Trillos, Zachary Kaplan, Thabo Samakhoana, Daniel Sanz-Alonso

arXiv:1710.07702v27.623 citations

Originality Highly original

AI Analysis

This provides theoretical guarantees for scalable semi-supervised learning methods, addressing a key bottleneck in large-scale applications.

The paper tackles the problem of ensuring consistency in graph-based Bayesian learning as unlabeled data grows, and proves that this leads to scalable MCMC algorithms with uniform spectral gaps, supported by numerical experiments.

A popular approach to semi-supervised learning proceeds by endowing the input data with a graph structure in order to extract geometric information and incorporate it into a Bayesian framework. We introduce new theory that gives appropriate scalings of graph parameters that provably lead to a well-defined limiting posterior as the size of the unlabeled data set grows. Furthermore, we show that these consistency results have profound algorithmic implications. When consistency holds, carefully designed graph-based Markov chain Monte Carlo algorithms are proved to have a uniform spectral gap, independent of the number of unlabeled inputs. Several numerical experiments corroborate both the statistical consistency and the algorithmic scalability established by the theory.

View on arXiv PDF

Similar