LLM_annotate: A Python package for annotating and analyzing fiction characters
LLM_annotate gives researchers in computational literary analysis and digital humanities a tool for efficient, reproducible character analysis. The contribution is incremental, since it packages existing methods rather than introducing new ones.
The author developed LLM_annotate, a Python package that standardizes workflows for analyzing fiction characters with large language models. It supports annotation of character behaviors, inference of traits, and validation via a human-in-the-loop GUI, and is demonstrated through tutorial examples with The Simpsons Movie and Pride and Prejudice.
LLM_annotate is a Python package for analyzing the personality of fiction characters with large language models. It standardizes workflows for annotating character behaviors in full texts (e.g., books and movie scripts), inferring character traits, and validating annotation and inference quality via a human-in-the-loop GUI. The package includes functions for text chunking, LLM-based annotation, character name disambiguation, quality scoring, and computation of character-level statistics and embeddings. Researchers can use any LLM (commercial, open-source, or custom) within LLM_annotate. Through tutorial examples using The Simpsons Movie and the novel Pride and Prejudice, I demonstrate how to use the package for efficient and reproducible character analyses.
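To make the workflow described above concrete, the following is a minimal sketch of the chunk, annotate, disambiguate, and aggregate steps in plain Python. It is not the LLM_annotate API: every name here (chunk_text, annotate, disambiguate, character_stats, toy_llm) is hypothetical, the keyword-based toy_llm stands in for a real model call, and the input text is an invented toy sentence. Quality scoring, embeddings, and the GUI are omitted.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List

# Hypothetical record shape: {"character": str, "behavior": str, "chunk": int}
Annotation = Dict[str, object]


def chunk_text(text: str, max_words: int = 200) -> List[str]:
    """Split a text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def annotate(chunks: List[str], llm: Callable[[str], List[Annotation]]) -> List[Annotation]:
    """Run a pluggable annotator on each chunk and record the chunk index."""
    annotations = []
    for index, chunk in enumerate(chunks):
        for item in llm(chunk):
            annotations.append({**item, "chunk": index})
    return annotations


def disambiguate(annotations: List[Annotation], aliases: Dict[str, str]) -> List[Annotation]:
    """Map alias names to one canonical character name."""
    return [
        {**a, "character": aliases.get(a["character"], a["character"])}
        for a in annotations
    ]


def character_stats(annotations: List[Annotation]) -> Dict[str, Dict[str, int]]:
    """Count how often each behavior was annotated for each character."""
    counts = defaultdict(Counter)
    for a in annotations:
        counts[a["character"]][a["behavior"]] += 1
    return {name: dict(counter) for name, counter in counts.items()}


def toy_llm(chunk: str) -> List[Annotation]:
    """Assumption: keyword rules standing in for a real LLM call."""
    rules = {"Lizzy": "teases", "Elizabeth": "refuses", "Darcy": "proposes"}
    return [
        {"character": name, "behavior": verb}
        for name, verb in rules.items()
        if name in chunk and verb in chunk
    ]


text = "Lizzy teases her sister. Mr. Darcy proposes. Elizabeth refuses him."
chunks = chunk_text(text, max_words=5)
raw = annotate(chunks, toy_llm)
merged = disambiguate(raw, aliases={"Lizzy": "Elizabeth"})
stats = character_stats(merged)
print(stats)
```

Passing the annotator as a callable reflects the claim that any LLM can be plugged in: swapping toy_llm for a function that calls a commercial, open-source, or custom model would leave the other steps unchanged. Note that fixed-size word chunks can split a sentence across a boundary, which is one reason a real workflow would pair annotation with a validation step.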