LG · Apr 14, 2022

ULF: Unsupervised Labeling Function Correction using Cross-Validation for Weak Supervision

arXiv:2204.06863v4132 citations · h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses noise reduction for practitioners using weak supervision, though it is incremental as it builds on existing cross-validation techniques.

The paper tackles noise in weak supervision by introducing ULF, an unsupervised algorithm that corrects labeling functions using cross-validation. The authors report improved weak supervision learning across multiple datasets without any manual labeling.

A cost-effective alternative to manual data labeling is weak supervision (WS), where data samples are automatically annotated using a predefined set of labeling functions (LFs), rule-based mechanisms that generate artificial labels for the associated classes. In this work, we investigate noise reduction techniques for WS based on the principle of k-fold cross-validation. We introduce a new algorithm ULF for Unsupervised Labeling Function correction, which denoises WS data by leveraging models trained on all but some LFs to identify and correct biases specific to the held-out LFs. Specifically, ULF refines the allocation of LFs to classes by re-estimating this assignment on highly reliable cross-validated samples. Evaluation on multiple datasets confirms ULF's effectiveness in enhancing WS learning without the need for manual labeling.
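The cross-validation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the nearest-centroid classifier, the fixed confidence `threshold`, the `mix` interpolation weight, and all function names (`weak_labels`, `fit_centroids`, `predict_proba`, `ulf_style_step`) are assumptions made here for illustration. `Z` is a sample-by-LF match matrix and `T` is the LF-to-class assignment that gets re-estimated.

```python
import math

def weak_labels(Z, T):
    """Sum the class rows of every LF that fired; None if no LF fired.
    Ties go to the lowest class index."""
    labels = []
    for row in Z:
        scores = [sum(z * t[c] for z, t in zip(row, T)) for c in range(len(T[0]))]
        labels.append(scores.index(max(scores)) if any(scores) else None)
    return labels

def fit_centroids(X, y, n_classes):
    """Toy stand-in classifier (an assumption): one mean vector per class."""
    centroids = []
    for c in range(n_classes):
        pts = [x for x, lab in zip(X, y) if lab == c]
        centroids.append([sum(col) / len(pts) for col in zip(*pts)] if pts else None)
    return centroids

def predict_proba(centroids, x):
    """Softmax over negative squared distances to the class centroids."""
    w = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, c))) for c in centroids]
    total = sum(w)
    return [v / total for v in w] if total else [1 / len(w)] * len(w)

def ulf_style_step(X, Z, T, folds, threshold=0.9, mix=0.5):
    """One round of cross-validated re-estimation of the LF-to-class matrix."""
    n_classes = len(T[0])
    T_new = [row[:] for row in T]
    for held_out in folds:
        # Train only on samples that none of the held-out LFs fired on.
        train = {i for i, row in enumerate(Z) if not any(row[j] for j in held_out)}
        idx = sorted(train)
        y_train = weak_labels([Z[i] for i in idx], T)
        centroids = fit_centroids([X[i] for i in idx], y_train, n_classes)
        if any(c is None for c in centroids):
            continue  # a class has no training data in this fold; skip it
        counts = {j: [0] * n_classes for j in held_out}
        for i, row in enumerate(Z):
            if i in train:
                continue
            probs = predict_proba(centroids, X[i])
            c = probs.index(max(probs))
            if probs[c] >= threshold:  # keep only highly reliable predictions
                for j in held_out:
                    if row[j]:
                        counts[j][c] += 1
        for j in held_out:
            total = sum(counts[j])
            if total:  # blend the old assignment with the re-estimated one
                T_new[j] = [mix * T[j][c] + (1 - mix) * counts[j][c] / total
                            for c in range(n_classes)]
    return T_new

# Toy data: LF 2 is assigned to class 0 but fires on samples near class 1.
X = [[0.0], [0.1], [0.2], [3.0], [3.1], [3.2], [2.9], [3.3]]
Z = [[1, 0, 0]] * 3 + [[0, 1, 0]] * 3 + [[0, 0, 1]] * 2
T = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
T_new = ulf_style_step(X, Z, T, folds=[[0], [1], [2]], mix=0.3)
```

On this toy data the row for LF 2 shifts toward class 1, while the rows for LF 0 and LF 1 stay unchanged because their folds yield either low-confidence predictions or a missing class. A real setup would replace the centroid classifier with a proper model and tune the threshold.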

Code Implementations: 1 repo