STAT-MECH SI SOC-PH ML

Nov 1, 2019

Phase transitions and optimal algorithms for semi-supervised classifications on graphs: from belief propagation to graph convolution network

arXiv:1911.00197v2

Originality Highly original

AI Analysis

This work addresses the evaluation and theoretical understanding of graph convolution neural networks by providing well-controlled benchmarks and optimal solutions, with practical improvements for semi-supervised classification tasks.

The paper studies clustering and semi-supervised classification on graphs that carry both relational and feature information. It identifies a phase transition that limits every algorithm and proposes a belief propagation algorithm that is asymptotically optimal on the generative model. The algorithm extends to a belief propagation graph convolution neural network (BPGCN) that overcomes the sparsity and overfitting issues observed in existing graph convolution neural networks and, when combined with classic neural network methods, performs strongly on some real-world datasets.

We perform theoretical and algorithmic studies of clustering and semi-supervised classification on graphs with both pairwise relational information and single-point feature information, based on a joint stochastic block model that generates synthetic graphs with both edges and node features. An asymptotically exact analysis based on Bayesian inference of the underlying model is conducted, using the cavity method from statistical physics. Theoretically, we identify a phase transition of the generative model, which puts fundamental limits on the ability of all possible algorithms in the clustering task of the underlying model. Algorithmically, we propose a belief propagation algorithm that is asymptotically optimal on the generative model and can be further extended to a belief propagation graph convolution neural network (BPGCN) for semi-supervised classification on graphs. For the first time, well-controlled benchmark datasets with asymptotically exact properties and optimal solutions can be produced for the evaluation of graph convolution neural networks and for the theoretical understanding of their strengths and weaknesses. In particular, on these synthetic benchmark networks we observe that existing graph convolution neural networks are subject to a sparsity issue and an overfitting issue in practice, both of which are successfully overcome by our BPGCN. Moreover, when combined with classic neural network methods, BPGCN yields classification performance on some real-world datasets that has not been achieved before.
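To make the idea of a synthetic benchmark with both edges and node features concrete, here is a minimal sketch. It is not the paper's joint stochastic block model, whose exact form is not given in this summary. It assumes a standard sparse stochastic block model for the edges and, purely as a stand-in, Gaussian node features centered on a random per-group vector. The function name and all parameters (`c_in`, `c_out`, `signal`, and so on) are hypothetical.

```python
import numpy as np


def sample_sbm_with_features(n, q, c_in, c_out, feat_dim, signal, seed=0):
    """Sample a toy graph with planted groups, edges, and node features.

    Assumptions (not taken from the paper): edges follow a sparse
    stochastic block model, and features are Gaussian around a
    per-group center scaled by `signal`.
    """
    rng = np.random.default_rng(seed)

    # Planted group label for every node, uniform over q groups.
    labels = rng.integers(0, q, size=n)

    # Sparse regime: edge probability c_in/n inside a group, c_out/n across.
    same_group = labels[:, None] == labels[None, :]
    probs = np.where(same_group, c_in / n, c_out / n)

    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(np.int8)

    # Stand-in feature model: group center plus unit-variance noise.
    centers = rng.normal(size=(q, feat_dim))
    features = signal * centers[labels] + rng.normal(size=(n, feat_dim))

    return adj, features, labels


if __name__ == "__main__":
    adj, features, labels = sample_sbm_with_features(
        n=500, q=2, c_in=8.0, c_out=2.0, feat_dim=10, signal=1.0
    )
    print("mean degree:", adj.sum() / adj.shape[0])
```

In a generator of this kind, the gap between `c_in` and `c_out` and the value of `signal` control how much information the edges and the features each carry about the planted labels, which is the sort of knob a controlled benchmark would vary.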
