CL · AI · LG · Jun 28, 2020

BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision

arXiv:2006.15509v1 · 261 citations · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the problem of reducing annotation costs for NER across domains. Its advance is incremental because it builds on existing methods such as BERT and self-training.

The paper tackles open-domain named entity recognition under distant supervision, which produces incomplete and noisy labels. It proposes BOND, a two-stage framework that combines pre-trained language models with self-training and outperforms existing distantly supervised NER methods on 5 benchmark datasets.

We study the open-domain named entity recognition (NER) problem under distant supervision. Distant supervision, though it does not require large amounts of manual annotation, yields highly incomplete and noisy distant labels via external knowledge bases. To address this challenge, we propose a new computational framework -- BOND, which leverages the power of pre-trained language models (e.g., BERT and RoBERTa) to improve the prediction performance of NER models. Specifically, we propose a two-stage training algorithm: in the first stage, we adapt the pre-trained language model to the NER task using the distant labels, which can significantly improve recall and precision; in the second stage, we drop the distant labels and propose a self-training approach to further improve model performance. Thorough experiments on 5 benchmark datasets demonstrate the superiority of BOND over existing distantly supervised NER methods. The code and distantly labeled data have been released at https://github.com/cliang1453/BOND.
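
To make concrete why distant labels from a knowledge base end up incomplete and noisy, here is a minimal sketch of dictionary-based distant labeling. It is not taken from the BOND repository: the toy knowledge base, the `distant_labels` helper, and the greedy longest-match rule are all assumptions chosen for illustration, and it covers only label generation, not either training stage.

```python
from typing import Dict, List, Tuple

# Hypothetical toy knowledge base: surface form (as a token tuple) -> entity type.
KNOWLEDGE_BASE: Dict[Tuple[str, ...], str] = {
    ("New", "York"): "LOC",
    ("Washington",): "LOC",
    ("Google",): "ORG",
}


def distant_labels(tokens: List[str], kb: Dict[Tuple[str, ...], str]) -> List[str]:
    """Tag tokens with BIO labels by greedy longest-match lookup in kb."""
    max_len = max((len(key) for key in kb), default=0)
    labels = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first so multi-token entries win.
        for span in range(min(max_len, len(tokens) - i), 0, -1):
            entity_type = kb.get(tuple(tokens[i:i + span]))
            if entity_type is not None:
                labels[i] = f"B-{entity_type}"
                for j in range(i + 1, i + span):
                    labels[j] = f"I-{entity_type}"
                i += span
                matched = True
                break
        if not matched:
            i += 1  # No entry starts here; the token keeps the "O" label.
    return labels


tokens = "George Washington joined DeepMind in New York".split()
labels = distant_labels(tokens, KNOWLEDGE_BASE)
for token, label in zip(tokens, labels):
    print(f"{token}\t{label}")
```

In this toy sentence, "Washington" is tagged as a location although it is part of a person's name (a noisy label), while "George" and "DeepMind" receive no entity label because the knowledge base lacks them (incomplete labels). These are the two failure modes the abstract attributes to distant supervision.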

Code Implementations: 1 repo
