CL

Oct 18, 2023

Code Book for the Annotation of Diverse Cross-Document Coreference of Entities in News Articles

arXiv:2310.12064v1

h-index: 3

Originality: Synthesis-oriented

AI Analysis

This work addresses the analysis of media bias through word-choice and labeling for researchers in computational linguistics and media studies, but it is incremental as it builds on existing annotation schemes.

The paper tackles the problem of annotating coreference across news articles by extending beyond identity relations to include near-identity and bridging relations, and provides a methodology for creating a diverse cross-document coreference corpus linked to Wikidata's knowledge graph.

This paper presents a scheme for annotating coreference across news articles, extending beyond traditional identity relations by also considering near-identity and bridging relations. It includes a precise description of how to set up Inception, the annotation tool used for this purpose, how to annotate entities in news articles, how to connect them with diverse coreferential relations, and how to link them across documents to Wikidata's global knowledge graph. This multi-layered annotation approach is discussed in the context of the problem of media bias. Our main contribution lies in providing a methodology for creating a diverse cross-document coreference corpus which can be applied to the analysis of media bias by word-choice and labelling.
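The multi-layered annotation described above (entity mentions, typed coreferential relations, and cross-document links to Wikidata) can be pictured as a small data model. The following is a minimal sketch only, not the code book's actual schema or Inception's export format: every class, field, and function name is hypothetical, and the Wikidata identifier is a placeholder rather than a real item.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class RelationType(Enum):
    """The three relation families named in the abstract."""
    IDENTITY = "identity"
    NEAR_IDENTITY = "near-identity"
    BRIDGING = "bridging"


@dataclass(frozen=True)
class Mention:
    """A span of text in one news article (hypothetical layout)."""
    doc_id: str
    start: int
    end: int
    text: str


@dataclass(frozen=True)
class Relation:
    """A typed link between two mentions."""
    source: Mention
    target: Mention
    relation_type: RelationType


@dataclass
class EntityCluster:
    """Mentions from any number of documents that refer to one entity."""
    label: str
    wikidata_qid: Optional[str] = None  # placeholder value, not a real item
    mentions: list = field(default_factory=list)


def word_choices_by_document(cluster):
    """Group the distinct surface forms used for an entity by document."""
    choices = {}
    for mention in cluster.mentions:
        choices.setdefault(mention.doc_id, set()).add(mention.text)
    return {doc: sorted(texts) for doc, texts in choices.items()}


# Hypothetical example: two articles label the same entity differently.
m1 = Mention("article_a", 0, 13, "the president")
m2 = Mention("article_b", 5, 15, "the regime")
m3 = Mention("article_b", 40, 53, "the president")
cluster = EntityCluster("example entity", "Q_PLACEHOLDER", [m1, m2, m3])
link = Relation(m1, m2, RelationType.NEAR_IDENTITY)

print(word_choices_by_document(cluster))
```

Grouping surface forms per document in this way is one simple route from a cross-document cluster to a comparison of word choice and labelling between outlets, which is the kind of analysis the abstract has in mind.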
