CL, LG · Mar 1, 2022

DAMO-NLP at SemEval-2022 Task 11: A Knowledge-based System for Multilingual Named Entity Recognition

arXiv:2203.00545v3634 citations · h-index: 42
AI Analysis

This paper addresses multilingual named entity recognition for NLP practitioners, but its contribution is incremental, as it builds on existing knowledge-based methods.

The paper tackles the challenge of recognizing ambiguous named entities in low-context settings by building a multilingual knowledge base from Wikipedia and using it to augment input sentences with retrieved context. The system won 10 of the 13 tracks in the MultiCoNER shared task.

The MultiCoNER shared task aims at detecting semantically ambiguous and complex named entities in short and low-context settings for multiple languages. The lack of contexts makes the recognition of ambiguous named entities challenging. To alleviate this issue, our team DAMO-NLP proposes a knowledge-based system, where we build a multilingual knowledge base based on Wikipedia to provide related context information to the named entity recognition (NER) model. Given an input sentence, our system effectively retrieves related contexts from the knowledge base. The original input sentences are then augmented with such context information, allowing significantly better contextualized token representations to be captured. Our system wins 10 out of 13 tracks in the MultiCoNER shared task.
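The retrieve-then-augment idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny in-memory knowledge base, the token-overlap scoring (a stand-in for whatever retrieval method the system actually uses), the `[SEP]` separator, and all function names are assumptions made only for illustration.

```python
from collections import Counter

# Hypothetical toy knowledge base; the described system builds its
# knowledge base from Wikipedia across multiple languages.
KNOWLEDGE_BASE = [
    "Harry Potter is a series of fantasy novels written by J. K. Rowling.",
    "The Shining is a 1980 horror film directed by Stanley Kubrick.",
    "Paris is the capital and most populous city of France.",
]

# Assumed separator between the input sentence and retrieved context.
SEP = " [SEP] "


def tokenize(text):
    """Lowercase, split on whitespace, and strip basic punctuation."""
    tokens = (tok.strip(".,!?").lower() for tok in text.split())
    return [tok for tok in tokens if tok]


def overlap_score(query, passage):
    """Count tokens shared by query and passage (multiset intersection)."""
    shared = Counter(tokenize(query)) & Counter(tokenize(passage))
    return sum(shared.values())


def retrieve(query, kb, top_k=1):
    """Return up to top_k passages with a non-zero overlap score, best first."""
    scored = [(overlap_score(query, passage), passage) for passage in kb]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_k]]


def augment(sentence, kb, top_k=1):
    """Append retrieved context to the sentence, joined by the separator."""
    return SEP.join([sentence] + retrieve(sentence, kb, top_k))


if __name__ == "__main__":
    print(augment("who directed the shining", KNOWLEDGE_BASE))
```

In a setup like this, the augmented string would be fed to the NER encoder so that the tokens of the original sentence are contextualized by the retrieved passage. Labels would presumably be predicted only for the tokens before the separator, though that detail is an assumption here rather than something the abstract states.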

Code Implementations: 1 repo