CL IR LG · Oct 30, 2023

Split-NER: Named Entity Recognition via Two Question-Answering-based Classifications

arXiv:2310.19942v126.4224 citations · h-index: 8 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses NER efficiency and accuracy for cross-domain applications, presenting an incremental improvement over existing QA-based methods.

The paper splits Named Entity Recognition (NER) into two sub-tasks, span detection and span classification, and formulates each as a question-answering problem. The authors report better performance than baselines on OntoNotes5.0, WNUT17 and a cybersecurity dataset, on-par performance on BioNLP13CG, and shorter training time than the QA baseline counterpart.

In this work, we address the NER problem by splitting it into two logical sub-tasks: (1) Span Detection, which simply extracts entity mention spans irrespective of entity type; (2) Span Classification, which classifies the spans into their entity types. Further, we formulate both sub-tasks as question-answering (QA) problems and produce two leaner models which can be optimized separately for each sub-task. Experiments with four cross-domain datasets demonstrate that this two-step approach is both effective and time-efficient. Our system, SplitNER, outperforms baselines on OntoNotes5.0, WNUT17 and a cybersecurity dataset and gives on-par performance on BioNLP13CG. In all cases, it achieves a significant reduction in training time compared to its QA baseline counterpart. The effectiveness of our system stems from fine-tuning the BERT model twice, separately for span detection and classification. The source code can be found at https://github.com/c3sr/split-ner.
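The two-step decomposition described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the question templates, the `split_ner` function, and the toy detector and classifier are all hypothetical stand-ins for the two separately fine-tuned BERT models, chosen only to show how a type-agnostic span detector feeds a per-span classifier.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int]  # token indices, end exclusive


@dataclass
class Entity:
    start: int
    end: int
    text: str
    label: str


# Hypothetical question templates (assumptions for illustration only).
DETECTION_QUESTION = "Extract important entity spans from the text."


def classification_question(mention: str) -> str:
    return f"What is {mention}?"


def split_ner(
    tokens: List[str],
    detect_spans: Callable[[str, List[str]], List[Span]],
    classify_span: Callable[[str, List[str]], str],
) -> List[Entity]:
    """Stage 1 finds mention spans without types; stage 2 types each span."""
    entities = []
    for start, end in detect_spans(DETECTION_QUESTION, tokens):
        mention = " ".join(tokens[start:end])
        label = classify_span(classification_question(mention), tokens)
        entities.append(Entity(start, end, mention, label))
    return entities


def toy_detector(question: str, tokens: List[str]) -> List[Span]:
    """Toy stand-in for a span detection model: runs of capitalised tokens."""
    spans, start = [], None
    for i, tok in enumerate(tokens):
        if tok[:1].isupper():
            if start is None:
                start = i
        elif start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tokens)))
    return spans


# Toy lookup table standing in for a span classification model.
TOY_TYPES: Dict[str, str] = {
    "Emily Chen": "PERSON",
    "Acme Labs": "ORG",
    "Toronto": "GPE",
}


def toy_classifier(question: str, tokens: List[str]) -> str:
    """Recover the mention from the question text and look up its type."""
    mention = question[len("What is "):-1]
    return TOY_TYPES.get(mention, "MISC")


tokens = "yesterday Emily Chen joined Acme Labs in Toronto".split()
entities = split_ner(tokens, toy_detector, toy_classifier)
for e in entities:
    print(e.start, e.end, e.text, e.label)
```

Because the two stages only communicate through spans, either callable can be swapped or retrained without touching the other, which mirrors the abstract's point that the two models can be optimized separately.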
