Parallel and Distributed Approaches for Graph Based Semi-supervised Learning
This work addresses efficient classification in graph-based learning for scenarios requiring scalability, though it appears incremental as it builds on existing graph and MCMC methods.
The paper tackles graph-based semi-supervised learning by proposing two parallel and distributed approaches: an affine map iteration using sparse matrix-vector multiplication and an MCMC-based random walk sampling algorithm, achieving very small classification errors and demonstrating the sampling algorithm's ability to handle new incoming nodes.
Two approaches for graph based semi-supervised learning are proposed. The firstapproach is based on iteration of an affine map. A key element of the affine map iteration is sparsematrix-vector multiplication, which has several very efficient parallel implementations. The secondapproach belongs to the class of Markov Chain Monte Carlo (MCMC) algorithms. It is based onsampling of nodes by performing a random walk on the graph. The latter approach is distributedby its nature and can be easily implemented on several processors or over the network. Boththeoretical and practical evaluations are provided. It is found that the nodes are classified intotheir class with very small error. The sampling algorithm's ability to track new incoming nodesand to classify them is also demonstrated.