SENTRA: Selected-Next-Token Transformer for LLM Text Detection
SENTRA addresses the growing misuse of LLMs with a general-purpose detector for undeclared AI-generated text, though the contribution appears incremental because it builds on existing Transformer and contrastive learning methods.
The paper tackles the problem of detecting LLM-generated text that is not explicitly declared as such. It introduces SENTRA, a Transformer-based encoder that uses selected-next-token-probability sequences and contrastive pre-training, and reports that it significantly outperforms baselines in out-of-domain settings across 24 text domains.
LLMs are becoming increasingly capable and widespread. Consequently, the potential and reality of their misuse are also growing. In this work, we address the problem of detecting LLM-generated text that is not explicitly declared as such. We present a novel, general-purpose, and supervised LLM text detector, SElected-Next-Token tRAnsformer (SENTRA). SENTRA is a Transformer-based encoder leveraging selected-next-token-probability sequences and utilizing contrastive pre-training on large amounts of unlabeled data. Our experiments on three popular public datasets across 24 domains of text demonstrate that SENTRA is a general-purpose classifier that significantly outperforms popular baselines in the out-of-domain setting.
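
To make the phrase "selected-next-token-probability sequences" concrete, the sketch below assumes it refers to the probability a language model assigns, at each position, to the token that actually appears next in the text. That reading is an assumption based on the name, not a detail given in the abstract. The function names, toy logits, and token ids are hypothetical, and SENTRA's actual feature extraction, encoder, and contrastive pre-training are not shown.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of vocabulary scores."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def selected_next_token_logprobs(logits_per_position, token_ids):
    """Return, for each position t, the log-probability of the token observed at t + 1.

    Assumption: logits_per_position[t] holds a scoring model's vocabulary
    scores for the token at position t + 1, given tokens up to position t.
    """
    if len(logits_per_position) != len(token_ids) - 1:
        raise ValueError("expected one logit vector per next-token prediction")
    features = []
    for t, logits in enumerate(logits_per_position):
        observed_next = token_ids[t + 1]  # the token actually present in the text
        features.append(log_softmax(logits)[observed_next])
    return features


# Hypothetical toy input: a vocabulary of 4 tokens and a 3-token text.
toy_logits = [
    [0.0, 0.0, 0.0, 0.0],  # uniform prediction for the second token
    [2.0, 0.0, 0.0, 0.0],  # prediction favouring token 0 for the third token
]
toy_tokens = [1, 3, 0]

features = selected_next_token_logprobs(toy_logits, toy_tokens)
```

Under this reading, `features` is a per-token sequence of log-probabilities that a Transformer encoder could consume in place of the raw text. In the toy input the second value is higher than the first because the scoring model favoured the token that actually followed.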