CL, Jan 30, 2024

Distinguishing Fictional Voices: a Study of Authorship Verification Models for Quotation Attribution

arXiv:2401.16968v1103 citations, h-index: 4, LaTeCH-CLfL
Originality: Synthesis-oriented
AI Analysis

This work addresses automated speaker detection in literary texts, a problem of interest to researchers in computational linguistics. Its contribution is incremental: the results are mixed and the authors call for further investigation.

The study tackled the problem of attributing quotes to characters in novels by exploring stylistic representations from authorship verification models, finding that while these models can distinguish characters, they do not consistently outperform semantic-only models in quote attribution.

Recent approaches to automatically detect the speaker of an utterance of direct speech often disregard general information about characters in favor of local information found in the context, such as surrounding mentions of entities. In this work, we explore stylistic representations of characters built by encoding their quotes with off-the-shelf pretrained Authorship Verification models in a large corpus of English novels (the Project Dialogism Novel Corpus). Results suggest that the combination of stylistic and topical information captured in some of these models accurately distinguishes characters from one another, but does not necessarily improve over semantic-only models when attributing quotes. However, these results vary across novels, and stylometric models tailored specifically to literary texts and to the study of characters warrant further investigation.
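
The idea in the abstract (encode each character's quotes, pool them into a character representation, then attribute an unseen quote to the closest character) can be illustrated with a minimal sketch. This is not the authors' implementation. The encoder below is a toy stand-in based on hashed character trigrams, used only so the example runs without downloading a model; in practice it would be replaced by a pretrained Authorship Verification or sentence encoder. Mean pooling and cosine similarity are assumptions, and all function names (`encode_quote`, `build_character_profiles`, `attribute_quote`) are hypothetical.

```python
import math
import zlib

DIM = 256  # size of the toy embedding space (assumption, not from the paper)


def encode_quote(text, n=3):
    """Toy stand-in for a pretrained encoder.

    Counts hashed character n-grams and L2-normalises the result.
    """
    vec = [0.0] * DIM
    padded = f" {text.lower()} "
    for i in range(len(padded) - n + 1):
        bucket = zlib.crc32(padded[i:i + n].encode("utf-8")) % DIM
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


def build_character_profiles(quotes_by_character):
    """Mean-pool quote embeddings into one vector per character.

    Assumes every character has at least one quote.
    """
    profiles = {}
    for name, quotes in quotes_by_character.items():
        embeddings = [encode_quote(q) for q in quotes]
        profiles[name] = [sum(col) / len(embeddings) for col in zip(*embeddings)]
    return profiles


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def attribute_quote(quote, profiles):
    """Return the character whose profile is most similar to the quote."""
    embedding = encode_quote(quote)
    return max(profiles, key=lambda name: cosine(embedding, profiles[name]))


if __name__ == "__main__":
    # Invented example quotes, not taken from the corpus.
    known = {
        "Character A": ["I dare say it will rain, my dear.", "How very tiresome, my dear."],
        "Character B": ["Arr, hoist the sails, ye dogs!", "Ye dogs will swab the deck!"],
    }
    profiles = build_character_profiles(known)
    print(attribute_quote("My dear, I dare say you are right.", profiles))
```

Keeping the encoder behind a single function makes it easy to compare a stylistic encoder against a semantic-only one on the same attribution step, which is the comparison the abstract describes.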

Code Implementations: 1 repo