Identifying Speakers and Addressees of Quotations in Novels with Prompt Learning
This work addresses a domain-specific problem in literary analysis: it supports more complete construction of character relationships in novels. The contribution is incremental, since it builds on existing quotation attribution research.
The paper tackles the identification of both speakers and addressees of quotations in novels. To address dataset scarcity, the authors annotate the first Chinese quotation corpus with these elements. They also propose prompt learning-based methods that outperform zero-shot and few-shot large language models on Chinese and English datasets.
Quotations in literary works, especially novels, are important for creating characters, reflecting character relationships, and driving plot development. Current research on quotation extraction in novels primarily focuses on quotation attribution, i.e., identifying the speaker of a quotation. However, the addressee is equally important for constructing the relationship between speaker and addressee. To tackle the problem of dataset scarcity, we annotate the first Chinese quotation corpus with elements including speaker, addressee, speaking mode, and linguistic cue. We propose prompt learning-based methods for speaker and addressee identification built on fine-tuned pre-trained models. Experiments on both Chinese and English datasets show the effectiveness of the proposed methods, which outperform methods based on zero-shot and few-shot large language models.
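To make the idea of prompt learning for speaker and addressee identification concrete, the following is a minimal sketch and not the paper's implementation. It assumes a cloze-style formulation in which a template with a mask slot is appended to the context and each candidate character is scored for that slot. The templates, the `[MASK]` token, and the names `build_prompt`, `identify`, and `toy_score` are all hypothetical; in particular, `toy_score` is a stand-in for the probability a fine-tuned pre-trained model would assign to a candidate.

```python
from typing import Callable, Dict, List

MASK = "[MASK]"  # assumed placeholder; the real mask token depends on the model

# Hypothetical cloze templates, one per role. These are illustrative only
# and are not taken from the paper.
TEMPLATES: Dict[str, str] = {
    "speaker": 'In the passage above, "{quote}" is said by {mask}.',
    "addressee": 'In the passage above, "{quote}" is said to {mask}.',
}


def build_prompt(context: str, quote: str, role: str) -> str:
    """Append the cloze template for the given role to the context."""
    template = TEMPLATES[role]
    return context + "\n" + template.format(quote=quote, mask=MASK)


def identify(prompt: str, candidates: List[str],
             score: Callable[[str, str], float]) -> str:
    """Return the candidate that the scorer rates highest for the mask slot."""
    return max(candidates, key=lambda name: score(prompt, name))


def toy_score(prompt: str, candidate: str) -> float:
    """Stand-in scorer: counts how often the candidate is mentioned.

    A real system would instead use a fine-tuned model's probability
    for the candidate at the mask position.
    """
    return float(prompt.count(candidate))


if __name__ == "__main__":
    context = ('Elizabeth turned to Jane. "You must rest," she said. '
               "Elizabeth smiled.")
    quote = "You must rest,"
    for role in ("speaker", "addressee"):
        prompt = build_prompt(context, quote, role)
        print(role, "->", identify(prompt, ["Elizabeth", "Jane"], toy_score))
```

Because the scorer is passed in as a function, the same prompt construction could be paired with a fine-tuned masked language model or with a zero-shot or few-shot large language model, which is the comparison the abstract describes. The mention-count scorer ignores the role entirely, so it cannot separate speaker from addressee; that distinction is exactly what a trained model would have to learn.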