CL · LG · ML · Nov 13, 2018

Few-shot Learning for Named Entity Recognition in Medical Text

arXiv:1811.05468v1 · 68 citations · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the scarcity of annotated examples in medical NLP and offers a practical approach for domain-specific applications, though the contribution is incremental.

The paper tackles named entity recognition in medical text when annotated data is limited. With only ten annotated examples available, it raises the F1 score from 69.3% to 78.87% by applying five improvements.
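To make the reported metric concrete, the following is a minimal sketch of an entity-level F1 computation over BIO-tagged sequences. The BIO scheme, the label names (`DRUG`, `DOSE`), and the exact-match criterion are assumptions made for illustration; the summary above does not state how the paper scores predictions.

```python
def extract_spans(tags):
    """Return the set of (start, end, label) entity spans in a BIO tag sequence."""
    spans = set()
    start, label = None, None
    # A sentinel "O" at the end closes any span still open.
    for i, tag in enumerate(list(tags) + ["O"]):
        # Close the open span when the entity ends or a different one begins.
        if start is not None and (
            tag == "O" or tag.startswith("B-") or tag[2:] != label
        ):
            spans.add((start, i, label))
            start, label = None, None
        # Open a new span on "B-", or leniently on a stray "I-".
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return spans


def f1_score(gold_tags, pred_tags):
    """Entity-level F1: a predicted span counts only if boundaries and label match."""
    gold, pred = extract_spans(gold_tags), extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical example: one of two entities is predicted correctly.
gold = ["B-DRUG", "I-DRUG", "O", "B-DOSE", "O"]
pred = ["B-DRUG", "I-DRUG", "O", "O", "B-DOSE"]
score = f1_score(gold, pred)

# Absolute gain reported in the summary, in percentage points.
reported_gain = round(78.87 - 69.3, 2)
```

Under this exact-match definition, a span with the right label but shifted boundaries counts as both a false positive and a false negative, which is why the toy prediction scores well below a per-token accuracy would suggest.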

Deep neural network models have recently achieved state-of-the-art performance gains in a variety of natural language processing (NLP) tasks (Young, Hazarika, Poria, & Cambria, 2017). However, these gains rely on the availability of large amounts of annotated examples, without which state-of-the-art performance is rarely achievable. This is especially inconvenient for the many NLP fields where annotated examples are scarce, such as medical text. To improve NLP models in this situation, we evaluate five improvements on named entity recognition (NER) tasks when only ten annotated examples are available: (1) layer-wise initialization with pre-trained weights, (2) hyperparameter tuning, (3) combining pre-training data, (4) custom word embeddings, and (5) optimizing out-of-vocabulary (OOV) words. Experimental results show that the F1 score of 69.3% achievable by state-of-the-art models can be improved to 78.87%.
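As an illustration of improvement (5), here is a hedged sketch of one way to reduce out-of-vocabulary (OOV) tokens before embedding lookup: try progressively more aggressive normalizations of a token and fall back to an unknown-word symbol only if none match. The specific normalization steps, the `<unk>` symbol, and all function names are assumptions for this example, not the procedure used in the paper.

```python
import re

UNK = "<unk>"  # hypothetical unknown-word symbol


def normalize_candidates(token):
    """Yield candidate forms of a token, from least to most normalized."""
    yield token
    yield token.lower()
    stripped = token.strip(".,:;()[]\"'")  # drop surrounding punctuation
    yield stripped
    yield stripped.lower()
    yield re.sub(r"\d", "0", stripped.lower())  # collapse every digit to 0


def lookup(token, vocab):
    """Return the first candidate form found in the vocabulary, else UNK."""
    for candidate in normalize_candidates(token):
        if candidate in vocab:
            return candidate
    return UNK


def oov_rate(tokens, vocab):
    """Fraction of tokens that still map to UNK after normalization."""
    if not tokens:
        return 0.0
    missing = sum(1 for token in tokens if lookup(token, vocab) == UNK)
    return missing / len(tokens)


# Hypothetical embedding vocabulary and a short medical-style sentence.
vocab = {"aspirin", "mg", "000", "daily"}
tokens = ["Aspirin", "325", "mg", "daily.", "xyzzy"]
mapped = [lookup(token, vocab) for token in tokens]
rate = oov_rate(tokens, vocab)
```

With plain exact-match lookup, four of these five tokens would be OOV; with the fallbacks, only the genuinely unknown `xyzzy` remains unmapped. When only ten annotated examples are available, every token that reaches a pre-trained embedding rather than a shared unknown vector carries information the model cannot learn from the labels alone.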

Code Implementations: 4 repos