CL · Mar 30

Who Wrote the Book? Detecting and Attributing LLM Ghostwriters

arXiv:2603.28054 · 80.7 · h-index: 36
AI Analysis

This paper addresses the challenge of identifying LLM-generated content, with applications such as content moderation and academic integrity. It is a domain-specific advance.

The paper tackles the problem of detecting and attributing the authorship of long-form texts generated by frontier LLMs. It introduces GhostWriteBench, a dataset of long-form texts (50K+ words per book), and proposes TRACE, a novel fingerprinting method that achieves state-of-the-art performance and remains robust in out-of-distribution settings.

In this paper, we introduce GhostWriteBench, a dataset for LLM authorship attribution. It comprises long-form texts (50K+ words per book) generated by frontier LLMs, and is designed to test generalisation across multiple out-of-distribution (OOD) dimensions, including domain and unseen LLM authors. We also propose TRACE, a novel fingerprinting method that is interpretable and lightweight and that works for both open- and closed-source models. TRACE creates the fingerprint by capturing token-level transition patterns (e.g., word rank) estimated by another lightweight language model. Experiments on GhostWriteBench demonstrate that TRACE achieves state-of-the-art performance, remains robust in OOD settings, and works well in limited training data scenarios.
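The abstract does not spell out how the fingerprint is built, so the following is only a minimal sketch of one way a rank-transition fingerprint could look, not the authors' TRACE implementation. It assumes a toy bigram model as a stand-in for the lightweight scoring language model, and the names `train_bigram`, `token_ranks`, `fingerprint`, and the bucket boundaries in `BUCKET_BOUNDS` are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical rank buckets (inclusive upper bounds): rank 1, ranks 2-3,
# ranks 4-10, and a final bucket for everything else, including unseen tokens.
BUCKET_BOUNDS = [1, 3, 10]


def train_bigram(tokens):
    """Toy stand-in for the lightweight scoring LM: next-token counts."""
    model = defaultdict(Counter)
    for prev, cur in zip(tokens, tokens[1:]):
        model[prev][cur] += 1
    return model


def token_ranks(model, tokens):
    """Rank of each observed token among the model's predictions for its
    context (1 = most likely); None when the model never saw that token."""
    ranks = []
    for prev, cur in zip(tokens, tokens[1:]):
        ordered = [tok for tok, _ in model.get(prev, Counter()).most_common()]
        ranks.append(ordered.index(cur) + 1 if cur in ordered else None)
    return ranks


def bucket(rank):
    """Map a rank to the index of its bucket."""
    if rank is not None:
        for i, bound in enumerate(BUCKET_BOUNDS):
            if rank <= bound:
                return i
    return len(BUCKET_BOUNDS)


def fingerprint(ranks):
    """Flattened, normalised counts of transitions between the rank buckets
    of consecutive tokens."""
    n = len(BUCKET_BOUNDS) + 1
    counts = [[0] * n for _ in range(n)]
    buckets = [bucket(r) for r in ranks]
    for a, b in zip(buckets, buckets[1:]):
        counts[a][b] += 1
    total = sum(map(sum, counts))
    return [c / total if total else 0.0 for row in counts for c in row]


# Usage: score a text with the toy model, then build its fingerprint.
scorer = train_bigram("a b a b a c".split())
ranks = token_ranks(scorer, "a b a c d".split())
vector = fingerprint(ranks)
```

Under these assumptions, one such vector would be computed per text and passed to any standard classifier to predict the generating model. Because the sketch only needs the text and a scoring model, it does not require access to the generator's own probabilities, which is in the spirit of the claim that the method applies to closed-source models.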

