CL · LG · Jun 20, 2020

Named Entity Extraction with Finite State Transducers

arXiv:2006.11548v1
Originality: Synthesis-oriented
AI Analysis

The paper offers an incremental contribution to NLP: a simple approach to named entity extraction that needs minimal linguistic knowledge and is meant to carry over to other languages without substantial changes.

It tackles named entity tagging by learning a series of automata (transducers) with supervised learning, reaching an overall F1 score of 60% on the Spanish CoNLL-2002 dataset.

We describe a named entity tagging system that requires minimal linguistic knowledge and can be applied to other target languages without substantial changes. The system builds on the ideas of Brill's tagger, which keeps it simple. Using supervised machine learning, we construct a series of automata (or transducers) to tag a given text. The final model is composed entirely of automata and tags in linear time. It was tested on the Spanish data set provided in CoNLL-2002, attaining an overall $F_{\beta=1}$ measure of 60%. We also present an algorithm for constructing the final transducer that encodes all the learned contextual rules.
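To make the Brill-style idea in the abstract concrete, here is a minimal sketch of how learned contextual rules might rewrite an initial tagging. It is not the paper's implementation: the `Rule` shape, the toy `LEXICON`, the single rule in `RULES`, and the function names are all hypothetical, and the rules are applied directly in a loop rather than compiled into a transducer.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical contextual rule: retag from_tag as to_tag
    when the previous token carries prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def initial_tags(tokens, lexicon, default="O"):
    # Baseline step: look each token up in a lexicon, falling back to "O".
    return [lexicon.get(tok, default) for tok in tokens]


def apply_rules(tags, rules):
    # Apply each rule in order, scanning left to right; a change takes
    # effect immediately and is visible to later positions and rules.
    tags = list(tags)
    for rule in rules:
        for i in range(1, len(tags)):
            if tags[i] == rule.from_tag and tags[i - 1] == rule.prev_tag:
                tags[i] = rule.to_tag
    return tags


# Toy data, invented for illustration only (not from CoNLL-2002).
LEXICON = {"Madrid": "B-LOC", "Real": "B-ORG"}
RULES = [Rule(from_tag="B-LOC", to_tag="I-ORG", prev_tag="B-ORG")]

tokens = ["El", "Real", "Madrid", "ganó"]
tags = apply_rules(initial_tags(tokens, LEXICON), RULES)
print(list(zip(tokens, tags)))
```

In this sketch the cost grows with the number of tokens times the number of rules. The abstract's claim of linear-time tagging refers to the final model, in which all learned rules are encoded in a single transducer.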
