CL | LG | NE | Nov 14, 2016

Attending to Characters in Neural Sequence Labeling Models

arXiv:1611.04361v1 | 193 citations
Originality: Incremental advance
AI Analysis

This paper addresses a bottleneck in natural language processing for tasks like named entity recognition, but the advance is incremental because it builds on existing sequence labeling architectures.

The paper tackles the problem of handling unseen or rare words in neural sequence labeling models by proposing a character-level extension with an attention mechanism. Character-level extensions improved performance on every benchmark, and the attention-based architecture achieved the best results while using fewer trainable parameters.

Sequence labeling architectures use word embeddings for capturing similarity, but suffer when handling previously unseen or rare words. We investigate character-level extensions to such models and propose a novel architecture for combining alternative word representations. By using an attention mechanism, the model is able to dynamically decide how much information to use from a word- or character-level component. We evaluated different architectures on a range of sequence labeling datasets, and character-level extensions were found to improve performance on every benchmark. In addition, the proposed attention-based architecture delivered the best results even with a smaller number of trainable parameters.
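
The attention idea in the abstract can be illustrated with a minimal sketch: a sigmoid gate, computed from both representations, decides per dimension how much of the word-level vector and how much of the character-level vector to keep. This is an illustration under stated assumptions, not the authors' implementation. The function names, the vector size, and the random untrained weights are all hypothetical, and it assumes both vectors have the same dimensionality.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, x):
    # Plain matrix-vector product for a list-of-rows matrix.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def gated_combine(word_vec, char_vec, W1, W2, W3):
    """Blend word-level and character-level vectors with a gate.

    z   = sigmoid(W3 . tanh(W1 . word_vec + W2 . char_vec))
    out = z * word_vec + (1 - z) * char_vec   (element-wise)
    """
    hidden = [
        math.tanh(a + b)
        for a, b in zip(matvec(W1, word_vec), matvec(W2, char_vec))
    ]
    z = [sigmoid(v) for v in matvec(W3, hidden)]
    combined = [
        zi * w + (1.0 - zi) * c for zi, w, c in zip(z, word_vec, char_vec)
    ]
    return combined, z


def random_matrix(rows, cols, rng):
    # Hypothetical stand-in for trained parameters.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4  # assumed shared size of both representations
W1, W2, W3 = (random_matrix(dim, dim, rng) for _ in range(3))

word_vec = [0.2, -0.1, 0.4, 0.0]   # e.g. a word embedding lookup
char_vec = [0.5, 0.3, -0.2, 0.1]   # e.g. output of a character-level encoder

combined, gate = gated_combine(word_vec, char_vec, W1, W2, W3)
print("gate:", gate)
print("combined:", combined)
```

Each gate value lies strictly between 0 and 1, so every output dimension is a weighted average of the two inputs. In a trained model, the gate for a rare or unseen word would be expected to lean toward the character-level component, since its word embedding carries little information.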

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
