CL · AI · HC · LG · Jun 10, 2019

GLTR: Statistical Detection and Visualization of Generated Text

arXiv:1906.04043v1 · 1210 citations · Has Code
Originality: Incremental advance
AI Analysis

The paper addresses the risk of language-model abuse by providing a simple detection method that non-experts can use. It is an incremental advance, since it builds on baseline statistical approaches.

The paper tackles the detection of AI-generated text with GLTR, a tool that uses statistical methods to surface generation artifacts. In a human-subjects study, GLTR's annotations raised the human detection rate of fake text from 54% to 72%.

The rapid improvement of language models has raised the specter of abuse of text generation systems. This progress motivates the development of simple methods for detecting generated text that can be used by and explained to non-experts. We develop GLTR, a tool to support humans in detecting whether a text was generated by a model. GLTR applies a suite of baseline statistical methods that can detect generation artifacts across common sampling schemes. In a human-subjects study, we show that the annotation scheme provided by GLTR improves the human detection-rate of fake text from 54% to 72% without any prior training. GLTR is open-source and publicly deployed, and has already been widely used to detect generated outputs.
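To make the idea of "baseline statistical methods" concrete, here is a minimal sketch of the kind of per-token statistics such a tool could compute: the probability a model assigns to each observed token, that token's rank among the model's predictions, and the entropy of the predicted distribution. Everything here is an assumption for illustration, not the GLTR implementation: the toy bigram model stands in for a real language model, and the names `train_bigram`, `token_stats`, and `bucket`, as well as the rank thresholds, are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Toy stand-in for a language model (assumption): bigram counts
    turned into next-token probability distributions."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return {
        prev: {tok: c / sum(ctr.values()) for tok, c in ctr.items()}
        for prev, ctr in counts.items()
    }


def token_stats(model, tokens):
    """For each token after the first, return (token, prob, rank, entropy)."""
    stats = []
    for prev, tok in zip(tokens, tokens[1:]):
        dist = model.get(prev, {})
        prob = dist.get(tok, 0.0)
        # Rank 1 is the most likely next token; ties break alphabetically.
        ranked = sorted(dist, key=lambda t: (-dist[t], t))
        rank = ranked.index(tok) + 1 if tok in dist else None
        # Entropy (in bits) of the predicted next-token distribution.
        entropy = -sum(p * math.log2(p) for p in dist.values() if p > 0)
        stats.append((tok, prob, rank, entropy))
    return stats


def bucket(rank, thresholds=(10, 100, 1000)):
    """Map a rank to a coarse bucket suitable for colour-coding.
    The thresholds are illustrative assumptions."""
    if rank is not None:
        for k in thresholds:
            if rank <= k:
                return f"top-{k}"
    return "outside"


corpus = "the cat sat on the mat the cat ran".split()
model = train_bigram(corpus)
for tok, prob, rank, entropy in token_stats(model, ["the", "cat", "sat"]):
    print(tok, round(prob, 3), rank, bucket(rank), round(entropy, 3))
```

The intuition is that text sampled from a model tends to consist mostly of tokens the model itself ranks highly, so a passage whose tokens almost all fall in the top bucket is a candidate for closer inspection. A real deployment would replace the bigram table with a large pretrained model's next-token distribution.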

Code Implementations: 6 repos
