ML LG NA OC · Jun 18, 2019

Consistency of semi-supervised learning algorithms on graphs: Probit and one-hot methods

Franca Hoffmann, Bamdad Hosseini, Zhi Ren, Andrew M. Stuart

arXiv:1906.07658v26.425 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the theoretical consistency of graph-based semi-supervised learning. It is incremental in the sense that it builds on existing methods.

The paper tackles the consistency of optimization-based graph semi-supervised learning algorithms for label propagation. It analyzes the probit and one-hot methods in the limit of small label noise and well-clustered data, and the results give insight into the choice of the rational function of the graph Laplacian that defines the optimization.

Graph-based semi-supervised learning is the problem of propagating labels from a small number of labelled data points to a larger set of unlabelled data. This paper is concerned with the consistency of optimization-based techniques for such problems, in the limit where the labels have small noise and the underlying unlabelled data is well clustered. We study graph-based probit for binary classification, and a natural generalization of this method to multi-class classification using one-hot encoding. The resulting objective function to be optimized comprises the sum of a quadratic form defined through a rational function of the graph Laplacian, involving only the unlabelled data, and a fidelity term involving only the labelled data. The consistency analysis sheds light on the choice of the rational function defining the optimization.
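
As a rough illustration of the kind of objective the abstract describes, the Python sketch below minimizes a probit-type objective on a toy graph with two well-separated clusters and one label per cluster. It is a sketch under stated assumptions, not the authors' implementation: the Gaussian kernel width `eps`, the specific choice `(L + tau^2 I)^alpha` for the function of the graph Laplacian, the parameter values, and the damped Newton solver are all picked for this example and are not taken from the paper.

```python
import math
import numpy as np

def graph_laplacian(X, eps=1.0):
    """Unnormalized Laplacian L = D - W of a Gaussian similarity graph (assumed kernel)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    W = np.exp(-d2 / (2.0 * eps ** 2))
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(axis=1)) - W

def norm_cdf(z):
    # Standard normal CDF via the complementary error function.
    return 0.5 * np.array([math.erfc(-v / math.sqrt(2.0)) for v in z])

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def probit_objective(u, P, idx, y, gamma):
    """Quadratic form in u plus a probit fidelity term on the labelled nodes only."""
    z = y * u[idx] / gamma
    return 0.5 * u @ P @ u - np.log(norm_cdf(z)).sum()

def fit_probit(L, idx, y, tau=0.5, alpha=2, gamma=0.5, iters=50):
    """Minimize the objective with Newton steps and simple step halving."""
    n = L.shape[0]
    # Assumed form of the quadratic form: P = (L + tau^2 I)^alpha.
    P = np.linalg.matrix_power(L + tau ** 2 * np.eye(n), alpha)
    u = np.zeros(n)
    for _ in range(iters):
        z = y * u[idx] / gamma
        r = norm_pdf(z) / norm_cdf(z)            # derivative of log CDF at z
        grad = P @ u
        grad[idx] -= y * r / gamma
        H = P.copy()
        H[idx, idx] += r * (z + r) / gamma ** 2  # curvature of the fidelity term
        step = np.linalg.solve(H, -grad)
        t, f0 = 1.0, probit_objective(u, P, idx, y, gamma)
        # Halve the step until the objective does not increase.
        while probit_objective(u + t * step, P, idx, y, gamma) > f0 and t > 1e-8:
            t *= 0.5
        u = u + t * step
    return u

# Toy data: two tight, well-separated clusters on the real line.
cluster = np.linspace(0.0, 0.9, 10)
X = np.concatenate([cluster, 10.0 + cluster])[:, None]
idx = np.array([0, 10])        # one labelled point per cluster
y = np.array([1.0, -1.0])      # binary labels in {+1, -1}

u = fit_probit(graph_laplacian(X), idx, y)
labels = np.sign(u)            # predicted label for every node
```

In this toy setting the sign of `u` propagates each label to the rest of its cluster, which is the behaviour that a consistency result of this kind is concerned with. Changing `alpha` or `tau` changes the function of the Laplacian that appears in the quadratic form.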
