CL
Apr 29, 2017

Semi-supervised sequence tagging with bidirectional language models

arXiv:1705.00108v1
665 citations
Originality: Highly original
AI Analysis

This work addresses the scarcity of labeled data facing NLP researchers and practitioners who work on sequence labeling tasks such as NER and chunking. It represents a significant, though not paradigm-shifting, advance.

The paper tackles the problem of limited labeled data for sequence labeling tasks by proposing a semi-supervised approach that adds pre-trained context embeddings from bidirectional language models to NLP systems. The result is state-of-the-art performance on named entity recognition and chunking datasets, surpassing previous systems that used additional labeled data or task-specific resources.

Pre-trained word embeddings learned from unlabeled text have become a standard component of neural network architectures for NLP tasks. However, in most cases, the recurrent network that operates on word-level representations to produce context sensitive representations is trained on relatively little labeled data. In this paper, we demonstrate a general semi-supervised approach for adding pre-trained context embeddings from bidirectional language models to NLP systems and apply it to sequence labeling tasks. We evaluate our model on two standard datasets for named entity recognition (NER) and chunking, and in both cases achieve state of the art results, surpassing previous systems that use other forms of transfer or joint learning with additional labeled data and task specific gazetteers.
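
As a rough illustration of what "adding pre-trained context embeddings from bidirectional language models" to a tagger could look like, the sketch below joins a forward and a backward language-model state for each token and appends the result to that token's tagger representation. The abstract does not say how the embeddings are combined, so concatenation is an assumption here. The function names (`lm_embedding`, `augment_with_lm`) and all numbers are hypothetical, and the language-model states are taken as given rather than computed.

```python
from typing import List

Vector = List[float]


def lm_embedding(forward_state: Vector, backward_state: Vector) -> Vector:
    """Join the forward and backward LM states for one token (assumed: concatenation)."""
    return forward_state + backward_state


def augment_with_lm(
    token_states: List[Vector],
    forward_lm: List[Vector],
    backward_lm: List[Vector],
) -> List[Vector]:
    """Append a pre-trained biLM embedding to each token's tagger representation."""
    if not (len(token_states) == len(forward_lm) == len(backward_lm)):
        raise ValueError("all inputs must cover the same number of tokens")
    return [
        state + lm_embedding(fwd, bwd)
        for state, fwd, bwd in zip(token_states, forward_lm, backward_lm)
    ]


# Toy sentence of two tokens; every value below is made up for illustration.
tagger_states = [[0.1, 0.2], [0.3, 0.4]]   # from the supervised tagger
forward_lm_states = [[1.0], [2.0]]         # from a left-to-right LM
backward_lm_states = [[-1.0], [-2.0]]      # from a right-to-left LM

augmented = augment_with_lm(tagger_states, forward_lm_states, backward_lm_states)
```

In this sketch each augmented vector would then feed the remaining layers of the sequence tagger, while the language models themselves stay fixed because they were trained separately on unlabeled text.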

Code Implementations: 1 repo
