Learning from Rules Generalizing Labeled Exemplars
This paper addresses the challenge of data labeling for machine learning applications where human supervision is costly, though it appears to be an incremental improvement over existing mixed-supervision techniques.
The paper tackles the problem of collecting human supervision when labeled data is scarce by proposing a rule-exemplar method that combines rule efficiency with label quality, resulting in improved accuracy over existing methods on five tasks.
In many applications labeled data is not readily available and needs to be collected via painstaking human supervision. We propose a rule-exemplar method for collecting human supervision to combine the efficiency of rules with the quality of instance labels. The supervision is coupled such that it is both natural for humans and synergistic for learning. We propose a training algorithm that jointly denoises rules via latent coverage variables, and trains the model through a soft implication loss over the coverage and label variables. The denoised rules and trained model are used jointly for inference. Empirical evaluation on five different tasks shows that (1) our algorithm is more accurate than several existing methods of learning from a mix of clean and noisy supervision, and (2) the coupled rule-exemplar supervision is effective in denoising rules.
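To make the idea of a soft implication loss over coverage and label variables concrete, here is a minimal sketch of one plausible form. It is an illustration under stated assumptions, not the paper's exact objective: the function name `soft_implication_loss` and the parameters `p_cover` and `p_label` are hypothetical. The sketch assumes that each rule asserts "if this rule covers x, then x has the rule's label", and it penalizes the probability that this implication is violated.

```python
import math


def soft_implication_loss(p_cover, p_label, eps=1e-12):
    """Negative log-probability that "rule covers x => y equals the rule's label" holds.

    p_cover: assumed probability that the rule's latent coverage variable is 1 for x.
    p_label: assumed model probability that x carries the label the rule assigns.
    Both names are hypothetical and used for illustration only.
    """
    # The implication is violated only when the rule covers x AND the label disagrees.
    p_violated = p_cover * (1.0 - p_label)
    # Clamp to avoid log(0) when the implication is violated with certainty.
    return -math.log(max(1.0 - p_violated, eps))


# The loss can be reduced in two ways: by raising the model's confidence in the
# rule's label, or by lowering the coverage probability (denoising the rule).
confident_rule_wrong_label = soft_implication_loss(p_cover=0.9, p_label=0.2)
denoised_rule_wrong_label = soft_implication_loss(p_cover=0.1, p_label=0.2)
confident_rule_right_label = soft_implication_loss(p_cover=0.9, p_label=0.95)
```

Under this assumed form, a rule that the coverage variable switches off for an instance contributes almost nothing to the loss, which is one way joint training could let noisy rules be restricted rather than forcing the model to agree with them.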