CL, Jan 27, 2024

To Burst or Not to Burst: Generating and Quantifying Improbable Text

Kuleen Sasse, Samuel Barham, Efsun Sarioglu Kayi, Edward W. Staley

arXiv:2401.15476v125.294 citations, h-index: 3, Has Code, GEM

Originality: Incremental advance

AI Analysis

This work addresses the detection of AI-generated text, which matters for applications such as content moderation and authenticity verification. Its contribution is incremental, since it builds on existing metrics and sampling methods.

The paper tackles the problem of distinguishing LLM-generated text from human-authored text by evaluating multiple metrics, sampling techniques, and models. It introduces a new metric (recoverability) to highlight the gap between the two and a new sampling method (burst sampling) to reduce it. Recoverability is the best separator when using LLaMA, and with Vicuna, burst sampling produces text that is distributionally closer to human text than other sampling techniques do.

While large language models (LLMs) are extremely capable at text generation, their outputs are still distinguishable from human-authored text. We explore this separation across many metrics over text, many sampling techniques, many types of text data, and across two popular LLMs, LLaMA and Vicuna. Along the way, we introduce a new metric, recoverability, to highlight differences between human and machine text; and we propose a new sampling technique, burst sampling, designed to close this gap. We find that LLaMA and Vicuna have distinct distributions under many of the metrics, and that this influences our results: Recoverability separates real from fake text better than any other metric when using LLaMA. When using Vicuna, burst sampling produces text which is distributionally closer to real text compared to other sampling techniques.
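
To make "separates real from fake text" concrete, here is a minimal sketch of one common way to score how well a single per-text metric distinguishes two groups: the area under the ROC curve (AUROC), computed directly from pairwise comparisons. This is an assumption made for illustration, not necessarily the separation measure the authors use. The function names `auroc` and `separation` and the toy scores are hypothetical, and the scores do not come from the paper.

```python
def auroc(human_scores, machine_scores):
    """Probability that a randomly chosen machine score exceeds a randomly
    chosen human score, with ties counted as half a win."""
    wins = 0.0
    for m in machine_scores:
        for h in human_scores:
            if m > h:
                wins += 1.0
            elif m == h:
                wins += 0.5
    return wins / (len(machine_scores) * len(human_scores))


def separation(human_scores, machine_scores):
    """Direction-agnostic separation: 0.5 means no separation, 1.0 means perfect."""
    a = auroc(human_scores, machine_scores)
    return max(a, 1.0 - a)


# Hypothetical per-document metric values (illustrative only, not paper data).
human = [0.2, 0.4, 0.5, 0.7]
machine = [0.6, 0.8, 0.9, 1.0]

print(separation(human, machine))
```

Under this reading, a metric whose separation score is closer to 1.0 separates human from machine text better, while a sampling technique that pushes the score toward 0.5 yields text that is harder to tell apart from human text under that metric.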
