CL | AI | LG | Jul 14, 2022

Multilinguals at SemEval-2022 Task 11: Complex NER in Semantically Ambiguous Settings for Low Resource Languages

arXiv:2207.06882v1627 citations | h-index: 22 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses complex NER in semantically ambiguous settings for low-resource languages, representing an incremental advancement.

The authors tackle complex named entity recognition for two low-resource languages, Chinese and Spanish, by leveraging pre-trained language models with Whole Word Masking and several neural architectures. Their models improve significantly over the baseline and reach a competitive position on the evaluation leaderboard.

We leverage pre-trained language models to solve the task of complex NER for two low-resource languages: Chinese and Spanish. We use the technique of Whole Word Masking (WWM) to boost the performance of the masked language modeling objective on large, unsupervised corpora. We experiment with multiple neural network architectures, incorporating CRFs, BiLSTMs, and linear classifiers on top of a fine-tuned BERT layer. All our models outperform the baseline by a significant margin, and our best-performing model obtains a competitive position on the evaluation leaderboard for the blind test set.
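The idea behind Whole Word Masking can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes WordPiece-style tokens in which a `##` prefix marks a continuation of the previous word, and the function names `group_whole_words` and `whole_word_mask` are hypothetical. For Chinese, word boundaries would typically come from a separate word segmenter rather than from `##` prefixes. The sketch also omits the usual BERT detail of sometimes keeping or randomly replacing a selected token instead of masking it.

```python
import random

# Special tokens that should never be masked (assumed BERT-style names).
SPECIAL_TOKENS = {"[CLS]", "[SEP]", "[PAD]"}


def group_whole_words(tokens):
    """Group token indices so that '##' continuation pieces stay with their word."""
    groups = []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and groups:
            groups[-1].append(i)   # continuation piece: attach to the previous word
        else:
            groups.append([i])     # start of a new word (or a special token)
    return groups


def whole_word_mask(tokens, mask_prob=0.15, mask_token="[MASK]", seed=None):
    """Mask whole words: if a word is selected, all of its pieces are masked.

    Returns the masked token list and the sorted list of masked positions.
    """
    rng = random.Random(seed)
    # Candidate words exclude special tokens.
    candidates = [g for g in group_whole_words(tokens)
                  if tokens[g[0]] not in SPECIAL_TOKENS]
    masked = list(tokens)
    if not candidates:
        return masked, []
    # Select at least one word, at most all candidate words.
    n_to_mask = min(len(candidates), max(1, round(len(candidates) * mask_prob)))
    chosen = rng.sample(candidates, n_to_mask)
    for group in chosen:
        for i in group:
            masked[i] = mask_token
    return masked, sorted(i for g in chosen for i in g)


if __name__ == "__main__":
    example = ["[CLS]", "phil", "##am", "##mon", "sings", "[SEP]"]
    print(whole_word_mask(example, seed=0))
```

With plain token-level masking, `##am` could be masked while `phil` and `##mon` stay visible, which makes the prediction comparatively easy. Masking all pieces of a word together forces the model to rely on the surrounding context instead.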

Code Implementations: 1 repo