Label Propagation with Weak Supervision
This work addresses the need to learn efficiently from limited labeled data, though its contribution is incremental because it builds on existing label propagation methods.
The paper aims to reduce the demand for labeled data. It analyzes and extends the classical label propagation algorithm to incorporate probabilistic hypothesized labels from weak supervision, and reports improvements on benchmark classification tasks.
Semi-supervised learning and weakly supervised learning are important paradigms that aim to reduce the growing demand for labeled data in current machine learning applications. In this paper, we introduce a novel analysis of the classical label propagation algorithm (LPA) (Zhu & Ghahramani, 2002) that additionally takes advantage of useful prior information, specifically probabilistic hypothesized labels on the unlabeled data. We provide an error bound that exploits both the local geometric properties of the underlying graph and the quality of the prior information. We also propose a framework to incorporate multiple sources of noisy information. In particular, we consider the setting of weak supervision, where our sources of information are weak labelers. We demonstrate the effectiveness of our approach on multiple benchmark weakly supervised classification tasks, showing improvements over existing semi-supervised and weakly supervised methods.
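
For illustration only, the following minimal sketch shows one way classical label propagation could be combined with probabilistic hypothesized labels on a small graph. It is not the method or the error bound analyzed in the paper. The function names, the mixing weight `alpha`, and the simple averaging of weak labeler outputs are all assumptions made for this example.

```python
import numpy as np


def aggregate_weak_labelers(votes):
    """Combine weak labeler outputs into one prior (hypothetical helper).

    votes: array of shape (num_labelers, n, k), each slice holding
    per-node class probabilities. Plain averaging is an assumption of
    this sketch, not a rule taken from the paper.
    """
    return np.asarray(votes, dtype=float).mean(axis=0)


def propagate_with_prior(W, labeled_idx, y_labeled, prior, alpha=0.3, n_iter=200):
    """Label propagation with an optional pull toward prior labels.

    W: (n, n) nonnegative affinity matrix; every row must have a
       positive sum (no isolated nodes).
    labeled_idx: indices of labeled nodes.
    y_labeled: (m, k) one-hot labels for those nodes.
    prior: (n, k) probabilistic hypothesized labels, rows summing to 1.
    alpha: hypothetical mixing weight; alpha=0 reduces to the classical
           clamped iteration F <- P F.
    """
    W = np.asarray(W, dtype=float)
    prior = np.asarray(prior, dtype=float)
    P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    F = prior.copy()
    F[labeled_idx] = y_labeled
    for _ in range(n_iter):
        # Average neighbor beliefs, then mix in the prior.
        F = (1.0 - alpha) * (P @ F) + alpha * prior
        # Clamp the labeled nodes to their known labels.
        F[labeled_idx] = y_labeled
    return F


# Toy example: a 4-node chain 0 - 1 - 2 - 3 with labeled endpoints.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
labeled_idx = np.array([0, 3])
y_labeled = np.array([[1.0, 0.0], [0.0, 1.0]])

# Two hypothetical weak labelers that disagree about node 1.
votes = np.array([
    [[0.5, 0.5], [0.0, 1.0], [0.5, 0.5], [0.5, 0.5]],
    [[0.5, 0.5], [0.2, 0.8], [0.5, 0.5], [0.5, 0.5]],
])
prior = aggregate_weak_labelers(votes)

F_plain = propagate_with_prior(W, labeled_idx, y_labeled, prior, alpha=0.0)
F_prior = propagate_with_prior(W, labeled_idx, y_labeled, prior, alpha=0.3)
```

With `alpha=0.0` the prior only initializes the unlabeled nodes and the iteration converges to the classical harmonic solution on this chain. With `alpha > 0` the prior keeps pulling node 1 toward the class the weak labelers favor. How much to trust the prior relative to the graph is exactly the kind of trade-off an error bound in terms of prior quality would inform.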