Computational Representations of Character Significance in Novels
This work offers literary scholars a computational framework for analyzing character significance in novels. The contribution is incremental in that it builds on existing literary theory and computational methods.
The authors model character significance in novels using a six-component structural model drawn from literary theory, one component of which is discussion by other characters. They apply the model to 19th-century British realist novels using LLMs and task-specific transformers, producing new computational representations for analyzing character centrality and gendered dynamics.
Characters in novels have typically been modeled based on their presence in the scenes of a narrative, considering aspects such as their actions, named mentions, and dialogue. This conception of character places significant emphasis on the main character, who is present in the most scenes. In this work, we instead adopt a framing developed from a new literary theory that proposes a six-component structural model of character. This model enables a comprehensive approach to character that accounts for the narrator-character distinction and includes a component neglected by prior methods: discussion by other characters. We compare general-purpose LLMs with task-specific transformers for operationalizing this model of character on major 19th-century British realist novels. Our methods yield both component-level and graph representations of character discussion. We then demonstrate that these representations allow us to approach literary questions at scale through a new computational lens. Specifically, we explore Woloch's classic "the one vs. the many" theory of character centrality and the gendered dynamics of character discussion.
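As a rough illustration of what a graph representation of character discussion could look like, the following minimal sketch builds a weighted directed graph from (speaker, discussed character) pairs and measures how much of the total discussion falls on the single most-discussed character. The character labels, the toy pairs, and the function names `build_discussion_graph` and `times_discussed` are hypothetical and introduced only for this example; they are not the authors' data, pipeline, or metric.

```python
from collections import Counter, defaultdict

# Hypothetical (speaker, discussed character) pairs; NOT data from the paper.
discussions = [
    ("A", "B"),
    ("C", "D"),
    ("E", "D"),
    ("A", "F"),
    ("F", "B"),
    ("B", "A"),
    ("A", "B"),
]


def build_discussion_graph(pairs):
    """Return a weighted directed graph: graph[speaker][discussed] = count."""
    graph = defaultdict(Counter)
    for speaker, discussed in pairs:
        graph[speaker][discussed] += 1
    return graph


def times_discussed(graph):
    """Weighted in-degree: how often each character is discussed by others."""
    totals = Counter()
    for speaker, targets in graph.items():
        for discussed, count in targets.items():
            if discussed != speaker:  # ignore characters talking about themselves
                totals[discussed] += count
    return totals


graph = build_discussion_graph(discussions)
totals = times_discussed(graph)

# Share of all discussion directed at the most-discussed character,
# one crude (assumed) way to look at "the one vs. the many".
top_character, top_count = totals.most_common(1)[0]
share = top_count / sum(totals.values())
print(top_character, top_count, round(share, 3))
```

A share close to 1 would mean that discussion concentrates on one character, while a low share would mean it is spread across many. Attaching a gender label to each node would allow the same edge counts to be grouped by speaker and discussed-character gender.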