CL · Dec 16, 2024

The Role of Natural Language Processing Tasks in Automatic Literary Character Network Construction

arXiv:2412.11560v122 citations · h-index: 3
Originality: Synthesis-oriented
AI Analysis

This work addresses how to optimize character network construction for literary analysis. It is incremental: it evaluates existing NLP tasks rather than introducing new methods.

The study investigates how low-level NLP tasks such as named entity recognition (NER) and coreference resolution affect automatic character network extraction from literary texts. It finds that NER strongly influences character detection and that coreference resolution is needed to avoid missing co-occurrences. Traditional NLP pipelines also outperform the two LLM-based methods tested in terms of recall.

The automatic extraction of character networks from literary texts is generally carried out using natural language processing (NLP) cascading pipelines. While this approach is widespread, no study exists on the impact of low-level NLP tasks on their performance. In this article, we conduct such a study on a literary dataset, focusing on the role of named entity recognition (NER) and coreference resolution when extracting co-occurrence networks. To highlight the impact of these tasks' performance, we start with gold-standard annotations, progressively add uniformly distributed errors, and observe their impact in terms of character network quality. We demonstrate that NER performance depends on the tested novel and strongly affects character detection. We also show that NER-detected mentions alone miss a lot of character co-occurrences, and that coreference resolution is needed to prevent this. Finally, we present comparison points with 2 methods based on large language models (LLMs), including a fully end-to-end one, and show that these models are outperformed by traditional NLP pipelines in terms of recall.
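The degradation protocol described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window-based co-occurrence rule, the single error type (randomly deleting mentions), the edge-recall measure, and all function names (`cooccurrence_network`, `drop_mentions`, `edge_recall`) are assumptions chosen for illustration.

```python
import random
from collections import Counter


def cooccurrence_network(mentions, window=10):
    """Build a weighted co-occurrence network.

    mentions: list of (token_position, character_id) pairs.
    Two different characters are linked each time their mentions lie
    at most `window` tokens apart (an assumed co-occurrence rule).
    """
    edges = Counter()
    ordered = sorted(mentions)
    for i, (pos_a, char_a) in enumerate(ordered):
        for pos_b, char_b in ordered[i + 1:]:
            if pos_b - pos_a > window:
                break  # later mentions are even farther away
            if char_a != char_b:
                edges[tuple(sorted((char_a, char_b)))] += 1
    return edges


def drop_mentions(mentions, error_rate, rng):
    """Delete each gold mention independently with probability error_rate.

    This simulates only one kind of uniformly distributed error
    (missed mentions); other error types are not modeled here.
    """
    return [m for m in mentions if rng.random() >= error_rate]


def edge_recall(gold_edges, predicted_edges):
    """Fraction of gold edges that survive in the predicted network."""
    if not gold_edges:
        return 1.0
    kept = sum(1 for edge in gold_edges if edge in predicted_edges)
    return kept / len(gold_edges)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy gold annotations: (token position, character)
    gold_mentions = [(0, "A"), (3, "B"), (20, "C"), (22, "A")]
    gold_net = cooccurrence_network(gold_mentions, window=5)
    for rate in (0.0, 0.25, 0.5):
        noisy = drop_mentions(gold_mentions, rate, rng)
        noisy_net = cooccurrence_network(noisy, window=5)
        print(rate, edge_recall(gold_net, noisy_net))
```

Sweeping `error_rate` from 0 upward and plotting the resulting edge recall gives a rough picture of how sensitive the network is to missed mentions, which is the kind of relationship the study measures on real novels.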

Code Implementations: 1 repo
