CL · Jun 6, 2019

GCDT: A Global Context Enhanced Deep Transition Architecture for Sequence Labeling

arXiv:1906.02437v11130 citations
AI Analysis

This work addresses performance bottlenecks in sequence labeling tasks like named entity recognition and chunking for NLP applications, representing an incremental improvement over existing methods.

The paper tackles the limitations of RNNs in sequence labeling by proposing GCDT, a global context enhanced deep transition architecture, which achieves 91.96 F1 on CoNLL03 NER and 95.43 F1 on CoNLL2000 Chunking with GloVe embeddings, and sets new SOTA results of 93.47 F1 and 97.30 F1 respectively when using BERT.

Current state-of-the-art systems for sequence labeling are typically based on the family of Recurrent Neural Networks (RNNs). However, the shallow connections between consecutive hidden states of RNNs and insufficient modeling of global information restrict the potential performance of those models. In this paper, we address these issues by proposing a Global Context enhanced Deep Transition architecture for sequence labeling, named GCDT. We deepen the state transition path at each position in a sentence, and further assign every token a global representation learned from the entire sentence. Experiments on two standard sequence labeling tasks show that, given only training data and the ubiquitous word embeddings (GloVe), our GCDT achieves 91.96 F1 on the CoNLL03 NER task and 95.43 F1 on the CoNLL2000 Chunking task, which outperforms the best reported results under the same settings. Furthermore, by leveraging BERT as an additional resource, we establish new state-of-the-art results with 93.47 F1 on NER and 97.30 F1 on Chunking.
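The two ideas in the abstract (a sentence-level global representation attached to every token, and a deeper state transition at each position) can be illustrated with a small pure-Python sketch. This is not the authors' implementation. The abstract does not say how the global vector is computed or what the transition layers look like, so the mean pooling, the plain tanh layers, the random weights, and every function name below (`mean_pool`, `add_global_context`, `deep_transition_step`, `encode`) are assumptions made purely for illustration.

```python
import math
import random


def mean_pool(vectors):
    """Average equal-length vectors into one sentence-level vector (assumed pooling)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def add_global_context(token_vectors):
    """Concatenate the same global vector onto every token vector."""
    global_vec = mean_pool(token_vectors)
    return [tok + global_vec for tok in token_vectors]  # list concatenation


def linear_tanh(weights, vec):
    """One dense layer with tanh; a stand-in for a real recurrent cell."""
    return [math.tanh(sum(w * x for w, x in zip(row, vec))) for row in weights]


def deep_transition_step(prev_state, token_vec, w_in, w_deep):
    """One position: an input-consuming transition, then extra state-only transitions."""
    state = linear_tanh(w_in, prev_state + token_vec)
    for w in w_deep:
        state = linear_tanh(w, state)
    return state


def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def encode(token_vectors, hidden_size, depth, seed=0):
    """Run the deep-transition recurrence over globally enriched tokens."""
    rng = random.Random(seed)
    enriched = add_global_context(token_vectors)
    w_in = random_matrix(rng, hidden_size, hidden_size + len(enriched[0]))
    w_deep = [random_matrix(rng, hidden_size, hidden_size) for _ in range(depth - 1)]
    state = [0.0] * hidden_size
    states = []
    for vec in enriched:
        state = deep_transition_step(state, vec, w_in, w_deep)
        states.append(state)
    return states


sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three toy token embeddings
states = encode(sentence, hidden_size=3, depth=3)
print(len(states), len(states[0]))
```

With `depth=1` the recurrence reduces to a single transition per token, which corresponds to the shallow connection between consecutive hidden states that the abstract identifies as a limitation; larger values of `depth` lengthen the path between them.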
