CL Jun 17, 2024

Evaluating LLMs for Quotation Attribution in Literary Texts: A Case Study of LLaMa3

Gaspard Michel, Elena V. Epure, Romain Hennequin, Christophe Cerisara

arXiv:2406.11380v311.515 citations Has Code

Originality: Incremental advance

AI Analysis

The paper establishes Llama-3 as the new state of the art for quotation attribution in English literature, a specific literary analysis task.

The researchers evaluated Llama-3's ability to attribute direct-speech utterances to speakers in novels, finding it surpassed ChatGPT and encoder-based baselines by a large margin on a corpus of 28 novels, with memorization not explaining the performance gain.

Large Language Models (LLMs) have shown promising results in a variety of literary tasks, often using complex memorized details of narration and fictional characters. In this work, we evaluate the ability of Llama-3 to attribute direct-speech utterances to their speakers in novels. The LLM shows impressive results on a corpus of 28 novels, surpassing published results with ChatGPT and encoder-based baselines by a large margin. We then validate these results by assessing the impact of book memorization and annotation contamination. We found that these types of memorization do not explain the large performance gain, making Llama-3 the new state of the art for quotation attribution in English literature. We publicly release our code and data.
