CV · LG · SP · Jan 12, 2023

Graph Laplacian for Semi-Supervised Learning

arXiv:2301.04956v2 · 8 citations · h-index: 25
Originality: Incremental advance
AI Analysis

This paper addresses the problem of limited labeled data in semi-supervised learning, a common concern for machine learning practitioners. The advance appears incremental: it modifies an existing operator, the graph Laplacian, rather than introducing a new paradigm.

The paper tackles the suboptimal performance of graph-Laplacian methods in semi-supervised learning as supervision decreases. It proposes a new graph Laplacian, based on density and contrastive measures, that encodes the labeled data directly in the operator, so that semi-supervised learning can be carried out via spectral clustering.

Semi-supervised learning is highly useful in common scenarios where labeled data is scarce but unlabeled data is abundant. The graph (or nonlocal) Laplacian is a fundamental smoothing operator for solving various learning tasks. For unsupervised clustering, a spectral embedding is often used, based on graph-Laplacian eigenvectors. For semi-supervised problems, the common approach is to solve a constrained optimization problem, regularized by a Dirichlet energy, based on the graph-Laplacian. However, as supervision decreases, Dirichlet optimization becomes suboptimal. We therefore would like to obtain a smooth transition between unsupervised clustering and low-supervised graph-based classification. In this paper, we propose a new type of graph-Laplacian which is adapted for Semi-Supervised Learning (SSL) problems. It is based on both density and contrastive measures and allows the encoding of the labeled data directly in the operator. Thus, we can successfully perform semi-supervised learning using spectral clustering. The benefits of our approach are illustrated for several SSL problems.
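
To make the two standard baselines mentioned in the abstract concrete, here is a minimal sketch in Python with NumPy. It shows unsupervised spectral clustering from graph-Laplacian eigenvectors and semi-supervised classification by minimizing the Dirichlet energy under label constraints. It does not implement the operator proposed in the paper, whose definition is not given here. The Gaussian affinity, the `sigma` value, the function names, and the toy two-blob data are all assumptions chosen for illustration.

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian-kernel affinity matrix with a zero diagonal (assumed kernel)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def graph_laplacian(W):
    """Unnormalized graph Laplacian L = D - W, with D the diagonal degree matrix."""
    return np.diag(W.sum(axis=1)) - W


def spectral_bipartition(L):
    """Unsupervised two-way split by the sign of the Fiedler vector."""
    _, eigvecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    fiedler = eigvecs[:, 1]         # eigenvector of the second-smallest eigenvalue
    return (fiedler > 0).astype(int)


def dirichlet_labels(L, labeled_idx, labeled_values):
    """Minimize the Dirichlet energy u^T L u subject to u = labeled_values on labeled_idx.

    On the unlabeled nodes the minimizer solves L_uu u_u = -L_ul u_l.
    """
    n = L.shape[0]
    labeled_idx = np.asarray(labeled_idx)
    unlabeled_idx = np.setdiff1d(np.arange(n), labeled_idx)
    u = np.zeros(n)
    u[labeled_idx] = labeled_values
    L_uu = L[np.ix_(unlabeled_idx, unlabeled_idx)]
    L_ul = L[np.ix_(unlabeled_idx, labeled_idx)]
    u[unlabeled_idx] = np.linalg.solve(L_uu, -L_ul @ u[labeled_idx])
    return u


# Toy data (an assumption for illustration): two well-separated 2-D blobs of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, size=(20, 2)),
    rng.normal(3.0, 0.3, size=(20, 2)),
])

W = gaussian_affinity(X, sigma=1.0)
L = graph_laplacian(W)

# Unsupervised: cluster ids are arbitrary because the eigenvector sign is arbitrary.
clusters = spectral_bipartition(L)

# Semi-supervised: one labeled point per blob, encoded as -1 and +1.
u = dirichlet_labels(L, labeled_idx=[0, 20], labeled_values=[-1.0, 1.0])
predicted = (u > 0).astype(int)
```

In this sketch the labels enter only as constraints on the optimization, while the operator `L` itself is built from the data alone. The abstract describes the opposite design choice: encoding the labeled data in the operator so that spectral clustering can be used directly.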

