CL · Jul 2, 2021

Concept Identification of Directly and Indirectly Related Mentions Referring to Groups of Persons

arXiv:2107.00955v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses a domain-specific problem in natural language processing, relevant to tasks such as text dimension reduction and named entity resolution, but it appears incremental because it builds on existing clustering approaches.

The paper tackles unsupervised concept identification for groups of persons as actors in texts, achieving results that separate geopolitical entities and cluster related mentions with diverse wording.

Unsupervised concept identification through clustering, i.e., the identification of semantically related words and phrases, is a common approach to identifying contextual primitives. It is employed in various use cases, e.g., text dimension reduction (replacing words with their concepts to reduce the vocabulary size), summarization, and named entity resolution. We demonstrate the first results of an unsupervised approach for the identification of groups of persons as actors extracted from a set of related articles. Specifically, the approach clusters mentions of groups of persons that act as non-named entity actors in the texts, e.g., "migrant families" = "asylum-seekers." Compared to our baseline, the approach keeps the mentions of the geopolitical entities separated, e.g., "Iran leaders" != "European leaders," and clusters (in)directly related mentions with diverse wording, e.g., "American officials" = "Trump Administration."
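
The text dimension reduction use case mentioned in the abstract can be illustrated with a minimal sketch. It assumes the clustering step has already produced a mapping from mentions to concepts; the mapping, the concept labels, and the helper names `replace_mentions` and `vocabulary` are hypothetical and are not taken from the paper. Only the mention groupings mirror the abstract's examples.

```python
import re

# Hypothetical output of a clustering step: mention -> concept label.
# Groupings follow the abstract's examples; the labels are invented.
CONCEPTS = {
    "migrant families": "CONCEPT_MIGRANTS",
    "asylum-seekers": "CONCEPT_MIGRANTS",
    "American officials": "CONCEPT_US_GOV",
    "Trump Administration": "CONCEPT_US_GOV",
    "Iran leaders": "CONCEPT_IRAN_LEADERS",
    "European leaders": "CONCEPT_EU_LEADERS",
}


def replace_mentions(text, concepts):
    """Replace every known mention in `text` with its concept label."""
    # Try longer mentions first so a long phrase wins over a shorter one.
    mentions = sorted(concepts, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(m) for m in mentions) + r")\b",
        re.IGNORECASE,
    )
    lookup = {m.lower(): c for m, c in concepts.items()}
    return pattern.sub(lambda match: lookup[match.group(0).lower()], text)


def vocabulary(text):
    """Return the set of lowercased tokens (hyphenated words stay whole)."""
    return {token.lower() for token in re.findall(r"[\w-]+", text)}


text = (
    "American officials met migrant families. "
    "The Trump Administration said asylum-seekers must wait."
)
reduced = replace_mentions(text, CONCEPTS)

print(reduced)
print(len(vocabulary(text)), "->", len(vocabulary(reduced)))
```

In this toy input the two mentions of each concept collapse into one label, while "Iran leaders" and "European leaders" would remain distinct because the assumed mapping keeps them in separate concepts.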

