ST LG ML | Jun 4, 2020

Rates of Convergence for Laplacian Semi-Supervised Learning with Low Labeling Rates

Jeff Calder, Dejan Slepčev, Matthew Thorpe

arXiv:2006.02765v114.237 citations

Originality: Incremental advance

AI Analysis

This work advances the theoretical understanding of graph-based learning methods, offering practitioners in machine learning and data science insight into label efficiency. It is an incremental advance, building on prior analyses of degeneracy.

The paper tackles the degeneracy of Laplacian semi-supervised learning at low labeling rates by identifying when the method is ill-posed and when it is well-posed. It proves that if the labeling rate β is much smaller than the square of the graph length scale ε, spikes form; if it is much larger, the solution is consistent with a continuum Laplace equation, with error estimates of O(εβ^{-1/2}) up to logarithmic factors.

We study graph-based Laplacian semi-supervised learning at low labeling rates. Laplacian learning uses harmonic extension on a graph to propagate labels. At very low label rates, Laplacian learning becomes degenerate and the solution is roughly constant with spikes at each labeled data point. Previous work has shown that this degeneracy occurs when the number of labeled data points is finite while the number of unlabeled data points tends to infinity. In this work we allow the number of labeled data points to grow to infinity with the total number of data points. Our results show that for a random geometric graph with length scale $\varepsilon>0$ and labeling rate $\beta>0$, if $\beta\ll\varepsilon^2$ then the solution becomes degenerate and spikes form, and if $\beta\gg \varepsilon^2$ then Laplacian learning is well-posed and consistent with a continuum Laplace equation. Furthermore, in the well-posed setting we prove quantitative error estimates of $O(\varepsilon\beta^{-1/2})$ for the difference between the solutions of the discrete problem and continuum PDE, up to logarithmic factors. We also study $p$-Laplacian regularization and show the same degeneracy result when $\beta\ll \varepsilon^p$. The proofs of our well-posedness results use the random walk interpretation of Laplacian learning and PDE arguments, while the proofs of the ill-posedness results use $\Gamma$-convergence tools from the calculus of variations. We also present numerical results on synthetic and real data to illustrate our results.
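
The harmonic extension described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `laplace_learning`, the 0/1 edge weights on an $\varepsilon$-ball graph, and the dense NumPy solve are all assumptions made for illustration. It fixes the labeled values and solves the graph Laplace equation at the unlabeled nodes, assuming every connected component of the graph contains at least one labeled point.

```python
import numpy as np


def laplace_learning(X, labeled_idx, labels, eps):
    """Harmonic extension of `labels` on an eps-ball graph over the rows of X.

    Hypothetical helper for illustration only. Assumes 0/1 weights
    (w_ij = 1 if |x_i - x_j| <= eps) and that every connected component
    of the graph contains a labeled point, so the linear system is solvable.
    """
    X = np.asarray(X, dtype=float)
    labeled_idx = np.asarray(labeled_idx, dtype=int)
    labels = np.asarray(labels, dtype=float)
    n = X.shape[0]

    # Dense pairwise squared distances: fine for small n only.
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = (d2 <= eps ** 2).astype(float)
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian L = D - W.
    L = np.diag(W.sum(axis=1)) - W

    is_labeled = np.zeros(n, dtype=bool)
    is_labeled[labeled_idx] = True
    lab = np.flatnonzero(is_labeled)
    unl = np.flatnonzero(~is_labeled)

    u = np.zeros(n)
    u[labeled_idx] = labels  # boundary condition: u = g at labeled points

    # Harmonicity at unlabeled nodes: L_uu u_u = -L_ul g.
    rhs = -L[np.ix_(unl, lab)] @ u[lab]
    u[unl] = np.linalg.solve(L[np.ix_(unl, unl)], rhs)
    return u


# Path graph 0-1-2-3-4 with the two endpoints labeled 0 and 1:
# the harmonic extension interpolates linearly between them.
path = np.arange(5.0).reshape(-1, 1)
u_path = laplace_learning(path, [0, 4], [0.0, 1.0], eps=1.0)
```

To explore the regimes in the abstract, one could sample points uniformly in a domain, label each independently with probability $\beta$, and compare the output for $\beta$ well below and well above $\varepsilon^2$; the sketch itself makes no claim about the rates proved in the paper.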
