CL LG May 5, 2023

Uncertainty-Aware Bootstrap Learning for Joint Extraction on Distantly-Supervised Data

arXiv:2305.03827v2
224 citations
Originality: Incremental advance
AI Analysis

This paper addresses label noise in distantly-supervised joint extraction, offering an incremental improvement for natural language processing tasks.

The paper tackles joint entity and relation extraction on distantly-supervised data with noisy labels. It proposes uncertainty-aware bootstrap learning, which uses instance-level uncertainty and self-ensembling to filter noise, and reports results that outperform existing baselines on two large datasets.

Jointly extracting entity pairs and their relations is challenging when working on distantly-supervised data with ambiguous or noisy labels. To mitigate this impact, we propose uncertainty-aware bootstrap learning, which is motivated by the intuition that the higher the uncertainty of an instance, the more likely the model's confidence is inconsistent with the ground truth. Specifically, we first explore instance-level data uncertainty to create an initial subset of high-confidence examples. This subset serves to filter out noisy instances and helps the model converge quickly at the early stage. During bootstrap learning, we propose self-ensembling as a regularizer to alleviate inter-model uncertainty produced by noisy labels. We further define the probability variance of joint tagging probabilities to estimate inner-model parametric uncertainty, which is used to select and build up new reliable training instances for the next iteration. Experimental results on two large datasets show that our approach outperforms existing strong baselines and related methods.
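The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the probability variance is computed, so the sketch assumes that each instance is scored over several stochastic forward passes (for example with dropout left on), that the per-token variance of the predicted tag's probability is averaged into one score, and that instances below a fixed threshold are kept. The names `probability_variance`, `select_reliable`, and `threshold` are hypothetical.

```python
from statistics import mean, pvariance


def probability_variance(passes):
    """Average per-token variance of tagging probabilities across passes.

    `passes` is a list of T stochastic forward passes (assumption); each
    pass is a list with one probability per token for the predicted tag.
    """
    n_tokens = len(passes[0])
    per_token = [pvariance([p[i] for p in passes]) for i in range(n_tokens)]
    return mean(per_token)


def select_reliable(instances, threshold):
    """Return ids of instances whose variance score is at most `threshold`.

    `instances` maps an instance id to its list of passes. The fixed
    threshold is an illustrative choice, not taken from the paper.
    """
    return sorted(
        inst_id
        for inst_id, passes in instances.items()
        if probability_variance(passes) <= threshold
    )


# Toy data: three passes over two tokens per instance (made-up numbers).
candidates = {
    "stable": [[0.90, 0.80], [0.92, 0.78], [0.88, 0.82]],
    "noisy": [[0.90, 0.20], [0.30, 0.85], [0.60, 0.40]],
}
selected = select_reliable(candidates, threshold=0.01)
print(selected)
```

In this toy setting the instance whose probabilities barely change between passes is kept for the next bootstrap iteration, while the one whose probabilities swing widely is held back.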

Code Implementations: 1 repo