What's in a Name? Are BERT Named Entity Representations just as Good for any other Name?
This paper addresses the brittleness of named entity representations, a practical concern for NLP practitioners, though the contribution is incremental because it builds on existing entity-aware models.
The paper investigates the robustness of BERT-based named entity representations to replacements from the same class, finding that state-of-the-art models are surprisingly brittle, and proposes a simple ensembling method that improves robustness and accuracy across three NLP tasks.
We evaluate named entity representations of BERT-based NLP models by investigating their robustness to replacements from the same typed class in the input. We highlight that, although such perturbations are natural, state-of-the-art trained models are surprisingly brittle on several tasks. This brittleness persists even with recent entity-aware BERT models. We also try to discern the cause of this non-robustness, considering factors such as tokenization and frequency of occurrence. We then provide a simple method that ensembles predictions from multiple replacements while jointly modeling the uncertainty of type annotations and label predictions. Experiments on three NLP tasks show that our method enhances robustness and increases accuracy on both natural and adversarial datasets.
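To make the ensembling idea concrete, the following is a minimal sketch under stated assumptions, not the paper's implementation. It assumes the prediction for an input can be approximated by marginalizing over entity types: each candidate type is weighted by its annotation probability, and within a type the label distributions are averaged over same-class replacement names. The names `replace_entity`, `ensemble_predict`, `toy_classify`, and the dictionaries `type_probs` and `candidates` are all hypothetical and introduced only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Sequence

Probs = Dict[str, float]


def replace_entity(text: str, mention: str, replacement: str) -> str:
    """Swap every occurrence of an entity mention for another name."""
    return text.replace(mention, replacement)


def ensemble_predict(
    text: str,
    mention: str,
    type_probs: Dict[str, float],
    candidates: Dict[str, Sequence[str]],
    classify: Callable[[str], Probs],
) -> Probs:
    """Average label probabilities over same-type replacements.

    Assumed formulation (not taken from the paper):
        p(y | x) ~ sum_t p(t | mention) * mean_{e in candidates[t]} p(y | x[mention -> e])
    """
    scores: Dict[str, float] = defaultdict(float)
    for ent_type, p_type in type_probs.items():
        names = candidates.get(ent_type, [])
        if not names:
            continue  # no replacements available for this type
        for name in names:
            perturbed = replace_entity(text, mention, name)
            for label, p_label in classify(perturbed).items():
                scores[label] += p_type * p_label / len(names)
    total = sum(scores.values())
    # Renormalize in case some types had no candidates.
    return {label: s / total for label, s in scores.items()} if total else {}


def toy_classify(text: str) -> Probs:
    """Hypothetical brittle classifier: its output flips on one specific name."""
    if "Paris" in text:
        return {"pos": 0.9, "neg": 0.1}
    return {"pos": 0.2, "neg": 0.8}


if __name__ == "__main__":
    result = ensemble_predict(
        text="I flew to London yesterday",
        mention="London",
        type_probs={"LOC": 0.75, "PER": 0.25},  # assumed type-annotation uncertainty
        candidates={"LOC": ["Paris", "Berlin"], "PER": ["Alice"]},
        classify=toy_classify,
    )
    print(result)
```

In this toy setup a single replacement ("Paris") flips the classifier's output, which mirrors the brittleness the abstract describes; averaging over several replacements dampens the influence of any one name. How the paper actually weights replacements and models type uncertainty may differ from this sketch.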