LLM one-shot style transfer for Authorship Attribution and Verification
This work addresses authorship analysis for forensic and literary applications by reducing reliance on spurious correlations, though it is incremental in that it applies LLMs to a known bottleneck.
The authors tackle authorship attribution and verification by leveraging LLM pre-training and in-context learning to measure style transferability, achieving higher accuracy than LLM prompting approaches of comparable scale and than contrastively trained baselines when controlling for topical correlations.
Computational stylometry analyzes writing style through quantitative patterns in text, supporting applications from forensic tasks such as identity linking and plagiarism detection to literary attribution in the humanities. Supervised and contrastive approaches rely on data with spurious correlations and often confuse style with topic. Although modern LLMs are a natural fit for AI-generated text detection, their causal language modeling (CLM) pre-training has scarcely been leveraged for general authorship problems. We propose a novel unsupervised approach based on this extensive pre-training and the in-context learning capabilities of LLMs, employing the log-probabilities of an LLM to measure style transferability from one text to another. Our method significantly outperforms LLM prompting approaches of comparable scale and achieves higher accuracy than contrastively trained baselines when controlling for topical correlations. Moreover, performance scales fairly consistently with the size of the base model and, in the case of authorship verification, with an additional mechanism that increases test-time computation, enabling flexible trade-offs between computational cost and accuracy.
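To make the idea of scoring style transferability with log-probabilities concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical interface `logprob_fn(context, target)` that returns per-token log-probabilities of `target` given `context`; in practice this would wrap a causal LLM. The names `transfer_score`, `attribute`, and `toy_logprob_fn` are illustrative, and subtracting a no-context baseline is an assumption, since the abstract does not specify the exact scoring formula. A smoothed unigram model stands in for the LLM so that the example runs without any model download.

```python
import math
from collections import Counter
from typing import Callable, Mapping, Sequence

# Hypothetical interface (an assumption): per-token log-probabilities of
# `target` given `context`. A real system would wrap a causal LLM here.
LogProbFn = Callable[[str, str], Sequence[float]]


def transfer_score(logprob_fn: LogProbFn, exemplar: str, target: str) -> float:
    """Mean per-token log-probability gain on `target` from conditioning on `exemplar`.

    The no-context baseline subtraction is an illustrative choice.
    """
    with_ctx = logprob_fn(exemplar, target)
    without_ctx = logprob_fn("", target)
    if not with_ctx or not without_ctx:
        raise ValueError("target must contain at least one token")
    return sum(with_ctx) / len(with_ctx) - sum(without_ctx) / len(without_ctx)


def attribute(logprob_fn: LogProbFn, candidates: Mapping[str, str], query: str) -> str:
    """Return the candidate author whose exemplar best 'transfers' to the query."""
    return max(candidates, key=lambda a: transfer_score(logprob_fn, candidates[a], query))


def toy_logprob_fn(context: str, target: str) -> list:
    """Stand-in for an LLM: add-one smoothed unigram model estimated from `context`."""
    counts = Counter(context.lower().split())
    tokens = target.lower().split()
    vocab = set(counts) | set(tokens)
    total = sum(counts.values()) + len(vocab)
    return [math.log((counts[t] + 1) / total) for t in tokens]


if __name__ == "__main__":
    candidates = {
        "A": "whilst the carriage hath departed whilst we tarried",
        "B": "lol the bus left while we were waiting lol",
    }
    query = "whilst we tarried the carriage hath departed"
    print(attribute(toy_logprob_fn, candidates, query))
```

Because the toy scorer only counts words, it conflates style with vocabulary and topic, which is the confusion the abstract warns about; it is there only to exercise the scoring logic. Swapping in an LLM-backed `logprob_fn` leaves `transfer_score` and `attribute` unchanged.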