CL, Aug 9, 2015

Bidirectional LSTM-CRF Models for Sequence Tagging

arXiv:1508.01991v1, 4412 citations
Originality: Incremental advance
AI Analysis

This work addresses sequence tagging for NLP researchers and practitioners. It is an incremental advance that combines two existing methods, bidirectional LSTM networks and a CRF layer, in a single model.

The authors propose a bidirectional LSTM-CRF model for sequence tagging. It reaches accuracy at or near the state of the art on POS, chunking, and NER datasets, is robust, and depends less on word embeddings than previously reported models.

In this paper, we propose a variety of Long Short-Term Memory (LSTM) based models for sequence tagging. These models include LSTM networks, bidirectional LSTM (BI-LSTM) networks, LSTM with a Conditional Random Field (CRF) layer (LSTM-CRF) and bidirectional LSTM with a CRF layer (BI-LSTM-CRF). Our work is the first to apply a bidirectional LSTM CRF (denoted as BI-LSTM-CRF) model to NLP benchmark sequence tagging data sets. We show that the BI-LSTM-CRF model can efficiently use both past and future input features thanks to a bidirectional LSTM component. It can also use sentence level tag information thanks to a CRF layer. The BI-LSTM-CRF model can produce state of the art (or close to) accuracy on POS, chunking and NER data sets. In addition, it is robust and has less dependence on word embedding as compared to previous observations.
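To illustrate how a CRF layer can use sentence-level tag information, here is a minimal sketch of Viterbi decoding over per-token tag scores. It is not the authors' implementation. The function name `viterbi_decode`, the hand-written emission scores (standing in for BI-LSTM outputs), and the toy transition matrix are all assumptions made for illustration.

```python
def viterbi_decode(emissions, transitions):
    """Return the highest-scoring tag path and its score.

    emissions[t][j]   : score of tag j at position t (assumed to come from
                        a BI-LSTM; here they are hand-written toy numbers).
    transitions[i][j] : score of moving from tag i to tag j (the CRF part).
    """
    if not emissions:
        return [], 0.0
    num_tags = len(emissions[0])
    scores = list(emissions[0])   # best score of any path ending in each tag
    backpointers = []             # best previous tag for each (step, tag)
    for emission in emissions[1:]:
        new_scores, pointers = [], []
        for j in range(num_tags):
            # Pick the previous tag that maximises the path score into tag j.
            best_prev = max(range(num_tags),
                            key=lambda i: scores[i] + transitions[i][j])
            new_scores.append(scores[best_prev] + transitions[best_prev][j]
                              + emission[j])
            pointers.append(best_prev)
        scores = new_scores
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag to recover the path.
    best_last = max(range(num_tags), key=lambda i: scores[i])
    path = [best_last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, scores[best_last]


# Toy tag set (hypothetical): 0 = O, 1 = B, 2 = I.
# The transition O -> I is heavily penalised, as in a BIO tagging scheme.
toy_transitions = [
    [0.0, 0.0, -10.0],
    [0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0],
]
toy_emissions = [
    [1.0, 0.5, 0.0],
    [0.2, 0.1, 1.0],
    [1.0, 0.0, 0.2],
]
best_path, best_score = viterbi_decode(toy_emissions, toy_transitions)
print(best_path, best_score)  # [1, 2, 0] 2.5
```

Choosing the best tag at each position independently would give O, I, O, which the transition scores heavily penalise. Decoding over the whole sentence instead returns B, I, O, because the penalty on the O to I transition outweighs the slightly higher emission score for O at the first position.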

Code Implementations: 25 repos

Data from Papers with Code (CC-BY-SA-4.0)
