CL · May 5, 2018

Chinese NER Using Lattice LSTM

arXiv:1805.02023v4 · 1170 citations
Originality: Incremental advance
AI Analysis

This paper addresses segmentation errors in Chinese NER. It offers NLP researchers a hybrid character-and-word approach that improves accuracy over purely word-based and purely character-based methods.

The paper tackled Chinese Named Entity Recognition (NER) by introducing a lattice-structured LSTM model that encodes characters and potential words from a lexicon, avoiding segmentation errors and leveraging word information. Experiments showed it outperformed word-based and character-based LSTM baselines, achieving the best results on various datasets.

We investigate a lattice-structured LSTM model for Chinese NER, which encodes a sequence of input characters as well as all potential words that match a lexicon. Compared with character-based methods, our model explicitly leverages word and word sequence information. Compared with word-based methods, lattice LSTM does not suffer from segmentation errors. Gated recurrent cells allow our model to choose the most relevant characters and words from a sentence for better NER results. Experiments on various datasets show that lattice LSTM outperforms both word-based and character-based LSTM baselines, achieving the best results.
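The lexicon-matching step described in the abstract can be illustrated with a small sketch. It covers only how candidate words are located in a character sequence, not the LSTM itself. The function name `match_lexicon`, the `max_len` cutoff, and the toy lexicon are assumptions made for illustration; they are not taken from the authors' code.

```python
from typing import List, Set, Tuple


def match_lexicon(
    sentence: str, lexicon: Set[str], max_len: int = 4
) -> List[Tuple[int, int, str]]:
    """Return (start, end, word) for every multi-character substring in the lexicon.

    `end` is exclusive. `max_len` is a hypothetical cap on word length,
    used here only to bound the search.
    """
    spans = []
    for start in range(len(sentence)):
        # Only consider substrings of at least two characters,
        # since single characters are already in the character sequence.
        last_end = min(start + max_len, len(sentence))
        for end in range(start + 2, last_end + 1):
            word = sentence[start:end]
            if word in lexicon:
                spans.append((start, end, word))
    return spans


# Toy lexicon (assumed for illustration only).
toy_lexicon = {"南京", "南京市", "市长", "长江", "长江大桥", "大桥"}
sentence = "南京市长江大桥"

lattice_words = match_lexicon(sentence, toy_lexicon)
for start, end, word in lattice_words:
    print(start, end, word)
```

Every matched span is kept, including overlapping ones such as 南京市 and 市长, so no single segmentation is committed to up front. In the model described above, it is the gated cells that decide how much each candidate word contributes.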

Code Implementations: 3 repos