CL DL | Oct 24, 2025

A Stylometric Application of Large Language Models

Harrison F. Stropkay, Jiayi Chen, Mohammad J. Latifi, Daniel N. Rockmore, Jeremy R. Manning

arXiv:2510.21958v1 | h-index: 1

Originality: Synthesis-oriented

AI Analysis

This work provides a stylometric tool for literary analysis and authorship verification, though the contribution is incremental: it applies existing LLMs to a new task.

The authors tackle authorship attribution by training a separate GPT-2 model from scratch on each author's works. Each model predicts held-out text from its own author more accurately than held-out text from other authors. The approach is demonstrated on books by eight known authors and is then used to confirm R. P. Thompson's authorship of the 15th book of the Oz series.

We show that large language models (LLMs) can be used to distinguish the writings of different authors. Specifically, an individual GPT-2 model, trained from scratch on the works of one author, will predict held-out text from that author more accurately than held-out text from other authors. We suggest that, in this way, a model trained on one author's works embodies the unique writing style of that author. We first demonstrate our approach on books written by eight different (known) authors. We also use this approach to confirm R. P. Thompson's authorship of the well-studied 15th book of the Oz series, originally attributed to L. Frank Baum.
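The comparison described in the abstract (one model per author, then checking which model predicts a held-out passage best) can be illustrated with a minimal sketch. This is not the authors' code: to stay self-contained it swaps in a smoothed character-bigram model where the paper trains GPT-2, and the class `CharBigramModel`, the function `attribute`, and the toy corpora are all hypothetical names and data invented for illustration. Only the decision rule, attributing the text to the author whose model gives the lowest held-out loss, mirrors the idea in the abstract.

```python
import math
from collections import Counter


class CharBigramModel:
    """Hypothetical stand-in for a per-author language model.

    The paper trains a GPT-2 model per author; a character-bigram model
    is used here only so the sketch runs without external libraries.
    """

    def __init__(self, text, alphabet):
        self.alphabet = alphabet
        # Counts of (previous char, next char) pairs and of each context char.
        self.pair_counts = Counter(zip(text, text[1:]))
        self.context_counts = Counter(text[:-1])

    def cross_entropy(self, text):
        """Average negative log-likelihood of `text`, in nats per character."""
        vocab_size = len(self.alphabet)
        total = 0.0
        for prev, nxt in zip(text, text[1:]):
            # Add-one smoothing keeps unseen pairs from getting zero probability.
            prob = (self.pair_counts[(prev, nxt)] + 1) / (
                self.context_counts[prev] + vocab_size
            )
            total -= math.log(prob)
        return total / (len(text) - 1)


def attribute(held_out, models):
    """Return (best author, per-author losses); lowest held-out loss wins."""
    losses = {author: model.cross_entropy(held_out) for author, model in models.items()}
    return min(losses, key=losses.get), losses


# Toy corpora, invented for illustration; they are not data from the paper.
corpora = {
    "author_a": "the cat sat on the mat. the cat ate the rat. " * 20,
    "author_b": "quantum flux jumps quickly; vexing zephyrs blow. " * 20,
}
alphabet = set("".join(corpora.values()))
models = {name: CharBigramModel(text, alphabet) for name, text in corpora.items()}

held_out = "the rat sat on the cat."
predicted, losses = attribute(held_out, models)
print(predicted, losses)
```

With a neural model, `cross_entropy` would be replaced by the average token-level loss of the author-specific model on the held-out passage, while the attribution step would stay the same.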
