CL | AI | LG | Mar 26, 2024

ELLEN: Extremely Lightly Supervised Learning For Efficient Named Entity Recognition

arXiv:2403.17385v181 citations | h-index: 5 | Has Code | LREC
Originality: Incremental advance
AI Analysis

This work addresses efficient NER with minimal labeled data. It offers a practical solution, though an incremental one, since it blends existing techniques.

The paper tackles semi-supervised named entity recognition under extremely light supervision: a lexicon with only 10 examples per class. It introduces ELLEN, a neuro-symbolic method that achieves strong performance on CoNLL-2003 with this lexicon and outperforms most existing semi-supervised methods when given 5% of the training data. Evaluated zero-shot on WNUT-17, the CoNLL-2003 model outperforms GPT-3.5 and performs comparably to GPT-4. In a zero-shot setting, ELLEN also reaches over 75% of the performance of a strong, fully supervised model.

In this work, we revisit the problem of semi-supervised named entity recognition (NER) focusing on extremely light supervision, consisting of a lexicon containing only 10 examples per class. We introduce ELLEN, a simple, fully modular, neuro-symbolic method that blends fine-tuned language models with linguistic rules. These rules include insights such as "One Sense Per Discourse", using a Masked Language Model as an unsupervised NER, leveraging part-of-speech tags to identify and eliminate unlabeled entities as false negatives, and other intuitions about classifier confidence scores in local and global context. ELLEN achieves very strong performance on the CoNLL-2003 dataset when using the minimal supervision from the lexicon above. It also outperforms most existing (and considerably more complex) semi-supervised NER methods under the same supervision settings commonly used in the literature (i.e., 5% of the training data). Further, we evaluate our CoNLL-2003 model in a zero-shot scenario on WNUT-17 where we find that it outperforms GPT-3.5 and achieves comparable performance to GPT-4. In a zero-shot setting, ELLEN also achieves over 75% of the performance of a strong, fully supervised model trained on gold data. Our code is available at: https://github.com/hriaz17/ELLEN.
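
To make the "One Sense Per Discourse" rule concrete, here is a minimal sketch, assuming the rule is applied as a per-document majority vote over predicted labels for each surface form. This is an illustration only and not the authors' implementation; the function name `one_sense_per_discourse` and the `(text, label)` mention format are hypothetical choices made for this example.

```python
from collections import Counter, defaultdict


def one_sense_per_discourse(mentions):
    """Relabel mentions so each surface form has one label per document.

    `mentions` is a list of (text, label) pairs predicted for a single
    document (hypothetical format, assumed for this sketch). Each mention
    receives the label that occurs most often for its lowercased text.
    """
    votes = defaultdict(Counter)
    for text, label in mentions:
        votes[text.lower()][label] += 1

    # Ties are broken in favour of the label seen first for that text.
    return [
        (text, votes[text.lower()].most_common(1)[0][0])
        for text, _ in mentions
    ]


if __name__ == "__main__":
    doc_mentions = [
        ("Washington", "PER"),
        ("Washington", "LOC"),
        ("Washington", "PER"),
        ("Paris", "LOC"),
    ]
    for text, label in one_sense_per_discourse(doc_mentions):
        print(text, label)
```

In this sketch the single "LOC" prediction for "Washington" is overruled by the two "PER" predictions in the same document. A confidence-weighted vote would be a natural variant, but whether ELLEN weights votes this way is not stated in the abstract.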

Code Implementations: 1 repo
