LG | CL | ML | Apr 13, 2020

Learning from Rules Generalizing Labeled Exemplars

arXiv:2004.06025v2 | 93 citations
AI Analysis

This paper addresses the challenge of data labeling for machine learning applications where human supervision is costly, though it appears to be an incremental improvement over existing mixed-supervision techniques.

The paper tackles the problem of collecting human supervision when labeled data is scarce. It proposes a rule-exemplar method that combines the efficiency of rules with the quality of instance labels, and reports improved accuracy over existing methods on five tasks.

In many applications labeled data is not readily available, and needs to be collected via painstaking human supervision. We propose a rule-exemplar method for collecting human supervision to combine the efficiency of rules with the quality of instance labels. The supervision is coupled such that it is both natural for humans and synergistic for learning. We propose a training algorithm that jointly denoises rules via latent coverage variables, and trains the model through a soft implication loss over the coverage and label variables. The denoised rules and trained model are used jointly for inference. Empirical evaluation on five different tasks shows that (1) our algorithm is more accurate than several existing methods of learning from a mix of clean and noisy supervision, and (2) the coupled rule-exemplar supervision is effective in denoising rules.
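
To make the "soft implication loss over the coverage and label variables" more concrete, here is a minimal sketch of one plausible form such a penalty could take. The abstract does not give the formula, so the form below is an assumption: it relaxes the logical implication "rule covers x implies the label equals the rule's label" into a penalty on the probability that the implication is violated. The function name `soft_implication_loss` and the arguments `p_cover` and `p_rule_label` are hypothetical names chosen for illustration.

```python
import math


def soft_implication_loss(p_cover: float, p_rule_label: float,
                          eps: float = 1e-12) -> float:
    """Assumed soft relaxation of: (rule covers x) => (label == rule's label).

    p_cover      -- probability that the rule's latent coverage variable is 1 for x
    p_rule_label -- probability the classifier assigns to the rule's label for x
    """
    # The implication is violated when the rule covers x but the label differs.
    p_violation = p_cover * (1.0 - p_rule_label)
    # Negative log-probability that the implication holds (eps avoids log(0)).
    return -math.log(max(1.0 - p_violation, eps))


# A rule judged not to cover x contributes no penalty, whatever the classifier says.
no_cover = soft_implication_loss(0.0, 0.1)
# A rule that covers x is penalized when the classifier disagrees with its label.
disagree = soft_implication_loss(1.0, 0.5)
```

Under this assumed form, the loss can be reduced in two ways: by raising the classifier's probability for the rule's label, or by lowering the coverage probability for an instance the rule should not apply to. That second path is the kind of mechanism a rule-denoising objective would need.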

Code Implementations: 2 repos
