CL · Nov 10, 2015

USFD: Twitter NER with Drift Compensation and Linked Data

arXiv:1511.03088v1 · 22 citations
Originality: Incremental advance
AI Analysis

This paper addresses entity labeling in noisy social media text, a problem of interest to NLP researchers. The contribution is incremental: it applies existing methods to a specific dataset.

The paper tackles named entity recognition (NER) on Twitter with a system that combines structured learning, Linked Data gazetteers, and unsupervised clustering features to compensate for stylistic and topic drift. It achieves competitive results in the W-NUT 2015 shared task.

This paper describes a pilot NER system for Twitter, which constitutes the USFD system entry to the W-NUT 2015 NER shared task. The goal is to correctly label entities in a tweet dataset, using an inventory of ten types. We employ structured learning, drawing on gazetteers taken from Linked Data and on unsupervised clustering features, and attempt to compensate for stylistic and topic drift, a key challenge in social media text. Our result is competitive; we provide an analysis of the components of our methodology and an examination of the target dataset in the context of this task.
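
To make the feature types named in the abstract concrete, here is a minimal sketch of per-token feature extraction for a sequence labeler. It is not the authors' feature set: the gazetteer entries, the bit-string cluster paths (assumed here to be Brown-style), the prefix lengths, and the function names are all hypothetical and chosen only for illustration. The idea it illustrates is that gazetteer membership and cluster-path prefixes let rare or newly coined spellings share features with words seen in training, which is one plausible way to soften drift.

```python
from typing import Dict, List, Set

# Hypothetical toy gazetteer; a real system would build this from Linked Data.
GAZETTEER: Dict[str, Set[str]] = {
    "person": {"obama", "beyonce"},
    "geo-loc": {"sheffield", "london"},
}

# Hypothetical bit-string cluster paths, keyed by lowercased token.
# In practice these would be induced from a large unlabeled tweet corpus.
CLUSTER_PATHS: Dict[str, str] = {
    "sheffield": "110100",
    "london": "110101",
    "tmrw": "011010",
    "tomorrow": "011011",
}


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dict for tokens[i], suitable for a CRF-style tagger."""
    tok = tokens[i]
    low = tok.lower()
    feats: Dict[str, object] = {
        "lower": low,
        "is_title": tok.istitle(),
        "is_mention": tok.startswith("@"),
        "is_hashtag": tok.startswith("#"),
    }
    # Gazetteer membership: one boolean feature per matching entity type.
    for etype, names in GAZETTEER.items():
        if low in names:
            feats["gaz=" + etype] = True
    # Cluster-path prefixes: shorter prefixes group coarser word classes,
    # so spelling variants can share a feature with their standard form.
    path = CLUSTER_PATHS.get(low)
    if path is not None:
        for n in (2, 4, 6):
            feats[f"cluster_{n}"] = path[:n]
    # Immediate context, with boundary markers at the sentence edges.
    feats["prev_lower"] = tokens[i - 1].lower() if i > 0 else "<s>"
    feats["next_lower"] = tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>"
    return feats


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Feature dicts for every token in one tweet."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

Calling `sentence_features(["Flying", "to", "Sheffield", "tmrw"])` yields one dict per token. In this toy data, "tmrw" and "tomorrow" share the same 4-bit cluster prefix, so a model that learned weights for that prefix from "tomorrow" can apply them to the abbreviated form.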

