CL · DL · Feb 3, 2017

Named Entity Evolution Recognition on the Blogosphere

arXiv:1702.01187v1 · 3 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses language evolution in web archives, which hinders users who need to find and interpret archived content. Its contribution is incremental: it adapts an existing method to a new domain.

The paper tackles the recognition of evolving named entities in noisy Web and Blogosphere data, where methods designed for newspaper collections fail. It adapts an existing method with novel noise-reducing filters and Semantic Web resources, and its evaluation indicates that the approach has potential.

Advancements in technology and culture lead to changes in our language. These changes create a gap between the language known by users and the language stored in digital archives. This gap affects users' ability first to find content and then to interpret it. In previous work we introduced our approach for Named Entity Evolution Recognition (NEER) in newspaper collections. Lately, increasing efforts in Web preservation have led to greater availability of Web archives covering longer time spans. However, language on the Web is more dynamic than in traditional media, and many of the basic assumptions from the newspaper domain do not hold for Web data. In this paper we discuss the limitations of existing methodology for NEER. We address these limitations by adapting an existing NEER method to work on noisy data such as the Web, and the Blogosphere in particular. We develop novel filters that reduce the noise, and we make use of Semantic Web resources to obtain more information about terms. Our evaluation shows the potential of the proposed approach.

