CL · AI · Feb 21, 2024

Technical Report on the Pangram AI-Generated Text Classifier

arXiv:2402.14873v3 · 26 citations · h-index: 3
AI Analysis

This work addresses the need for reliable detection of AI-generated text across domains. It offers a significant improvement over current tools, though its contribution to detection accuracy is incremental.

The researchers address the problem of distinguishing AI-generated text from human-written text with Pangram Text, a transformer-based classifier whose error rate is more than 38 times lower than that of existing methods on a benchmark spanning 10 text domains and 8 language models.

We present Pangram Text, a transformer-based neural network trained to distinguish text written by large language models from text written by humans. Pangram Text outperforms zero-shot methods such as DetectGPT as well as leading commercial AI detection tools with over 38 times lower error rates on a comprehensive benchmark comprised of 10 text domains (student writing, creative writing, scientific writing, books, encyclopedias, news, email, scientific papers, short-form Q&A) and 8 open- and closed-source large language models. We propose a training algorithm, hard negative mining with synthetic mirrors, that enables our classifier to achieve orders of magnitude lower false positive rates on high-data domains such as reviews. Finally, we show that Pangram Text is not biased against nonnative English speakers and generalizes to domains and models unseen during training.
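The abstract names "hard negative mining with synthetic mirrors" but does not describe it. The following is a minimal sketch of one plausible reading, not the authors' implementation: human texts that the current model wrongly flags as AI are collected as hard negatives, and each is paired with a synthetic counterpart before retraining. The function names (`mine_hard_negatives`, `augment_with_mirrors`), the `score_ai` and `make_mirror` callables, the 0.5 threshold, and the toy data are all hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[str, int]  # (text, label): 0 = human, 1 = AI-generated


def mine_hard_negatives(
    score_ai: Callable[[str], float],
    human_pool: List[str],
    threshold: float = 0.5,   # assumed decision threshold
    max_mined: int = 1000,    # assumed cap on mined examples
) -> List[str]:
    """Return human texts the current model scores as AI (false positives),
    ordered from highest to lowest score."""
    scored = [(score_ai(text), text) for text in human_pool]
    false_positives = [(s, t) for s, t in scored if s >= threshold]
    false_positives.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in false_positives[:max_mined]]


def augment_with_mirrors(
    train_set: List[Example],
    hard_negatives: List[str],
    make_mirror: Callable[[str], str],
) -> List[Example]:
    """Add each hard negative (label 0) together with a synthetic
    counterpart (label 1). How the counterpart is produced is an
    assumption; here it is delegated to the caller via make_mirror."""
    augmented = list(train_set)
    for text in hard_negatives:
        augmented.append((text, 0))
        augmented.append((make_mirror(text), 1))
    return augmented


# Toy stand-ins for a real classifier and a real LLM call (hypothetical).
def toy_score(text: str) -> float:
    return 0.9 if "furthermore" in text.lower() else 0.1


def toy_mirror(text: str) -> str:
    return "[synthetic] " + text


pool = ["Furthermore, the results hold.", "lol ok see you at 5"]
hard = mine_hard_negatives(toy_score, pool)
new_train = augment_with_mirrors([], hard, toy_mirror)
```

In a real pipeline the classifier would be retrained on the augmented set and the mining step repeated; pairing every mined human text with a synthetic one keeps the added data balanced between the two labels.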

