CL LG NE Nov 14, 2016

Attending to Characters in Neural Sequence Labeling Models

Marek Rei, Gamal K. O. Crichton, Sampo Pyysalo

arXiv:1611.04361v1 18.0 193 citations

Originality: Incremental advance

AI Analysis

This work addresses a bottleneck in natural language processing for tasks such as named entity recognition, but the advance is incremental because it builds on existing sequence labeling architectures.

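The paper tackles the problem of handling unseen or rare words in neural sequence labeling models. It proposes a character-level extension whose output is combined with the word-level representation through an attention mechanism. Character-level extensions improved performance on every benchmark, and the attention-based architecture delivered the best results even with fewer trainable parameters.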

Sequence labeling architectures use word embeddings for capturing similarity, but suffer when handling previously unseen or rare words. We investigate character-level extensions to such models and propose a novel architecture for combining alternative word representations. By using an attention mechanism, the model is able to dynamically decide how much information to use from a word- or character-level component. We evaluated different architectures on a range of sequence labeling datasets, and character-level extensions were found to improve performance on every benchmark. In addition, the proposed attention-based architecture delivered the best results even with a smaller number of trainable parameters.
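The abstract describes the attention mechanism only in words, so the following is a minimal sketch of one way such a combination could look, not the authors' implementation. It assumes the word-level and character-level vectors have the same dimension and that the model computes a per-dimension gate with values between 0 and 1. The function names, the weight matrices (w_word, w_char, w_gate), and the random initial values are all hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def attention_combine(word_vec, char_vec, w_word, w_char, w_gate):
    """Blend word- and character-level vectors with a per-dimension gate.

    Assumed form (illustrative only):
        gate     = sigmoid(w_gate @ tanh(w_word @ word_vec + w_char @ char_vec))
        combined = gate * word_vec + (1 - gate) * char_vec
    """
    hidden = [
        math.tanh(a + b)
        for a, b in zip(matvec(w_word, word_vec), matvec(w_char, char_vec))
    ]
    gate = [sigmoid(g) for g in matvec(w_gate, hidden)]
    combined = [z * x + (1.0 - z) * m for z, x, m in zip(gate, word_vec, char_vec)]
    return combined, gate


def random_matrix(rows, cols, rng):
    # Stand-in for trained weights; real values would be learned.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
dim = 4
word_vec = [rng.uniform(-1.0, 1.0) for _ in range(dim)]  # e.g. a word embedding
char_vec = [rng.uniform(-1.0, 1.0) for _ in range(dim)]  # e.g. a character-level encoding
w_word = random_matrix(dim, dim, rng)
w_char = random_matrix(dim, dim, rng)
w_gate = random_matrix(dim, dim, rng)

combined, gate = attention_combine(word_vec, char_vec, w_word, w_char, w_gate)
```

In this sketch, a gate value near 1 means the combined vector leans on the word-level component for that dimension, and a value near 0 means it leans on the character-level component. Under these assumptions, that is how a model could fall back on characters for rare or unseen words.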
