CL · Sep 11, 2021

XCoref: Cross-document Coreference Resolution in the Wild

arXiv:2109.05252v1 · 7 citations
Originality: Incremental advance
AI Analysis

This work addresses bias detection in news, for both readers and researchers. It is incremental, however: it builds on existing CDCR methods by extending them to more complex relations.

The paper tackles cross-document coreference resolution (CDCR) for abstract or loose relations in political news, where word choice and labeling can bias readers. It proposes XCoref, an unsupervised method that outperforms both a state-of-the-art CDCR method and the earlier TCA method in resolving such complex mentions.

Datasets and methods for cross-document coreference resolution (CDCR) focus on events or entities with strict coreference relations. They do not, however, annotate or resolve coreferential mentions with more abstract or loose relations, which can occur when news articles report on controversial and polarized events. Bridging and loose coreference relations trigger associations that may expose news readers to bias by word choice and labeling. For example, coreferential mentions of "direct talks between U.S. President Donald Trump and Kim" such as "an extraordinary meeting following months of heated rhetoric" or "great chance to solve a world problem" form a more positive perception of this event. A step towards raising awareness of bias by word choice and labeling is the reliable resolution of coreferences with high lexical diversity. We propose XCoref, an unsupervised CDCR method that resolves not only previously prevalent entities, such as persons, e.g., "Donald Trump," but also abstractly defined concepts, such as groups of persons, e.g., "caravan of immigrants," and events and actions, e.g., "marching to the U.S. border." In an extensive evaluation, we compare XCoref to a state-of-the-art CDCR method and to TCA, a previous method that resolves such complex coreference relations, and find that XCoref outperforms both. That XCoref outperforms an established CDCR model shows that new CDCR models need to be evaluated on semantically complex mentions with looser coreference relations to establish whether they can resolve mentions in the "wild" of political news articles.

