LGSTMLJul 29, 2020

Almost exact recovery in noisy semi-supervised learning

arXiv:2007.14717v45 citations
AI Analysis

This work addresses classification accuracy in noisy semi-supervised learning, which is incremental as it builds on existing graph-based methods with a focus on robustness to label noise.

The paper tackles the problem of graph-based semi-supervised learning with noisy labeled data by deriving a MAP estimator for clustering in a DC-SBM and proposing a consistent algorithm; numerical experiments show promising performance even with high noise levels.

Graph-based semi-supervised learning methods combine the graph structure and labeled data to classify unlabeled data. In this work, we study the effect of a noisy oracle on classification. In particular, we derive the Maximum A Posteriori (MAP) estimator for clustering a Degree Corrected Stochastic Block Model (DC-SBM) when a noisy oracle reveals a fraction of the labels. We then propose an algorithm derived from a continuous relaxation of the MAP, and we establish its consistency. Numerical experiments show that our approach achieves promising performance on synthetic and real data sets, even in the case of very noisy labeled data.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes