ME · AI · AP · May 29

A Distribution-Free Framework for Rewrite-Based Human-text Detection via Knockoff Filtering

arXiv:2606.00402 · 67.3
Predicted impact top 4% in ME · last 90 days · Originality Incremental advance
AI Analysis

For practitioners needing reliable detection of LLM-generated text, this work provides a way to add statistical guarantees to existing rewrite-based detectors.

The paper introduces a distribution-free framework that converts rewrite-based LLM-generated text detectors into ones with finite-sample false discovery rate (FDR) guarantees, without retraining. It demonstrates reliable FDR control with meaningful detection power across three detection models, 19 domains, and four LLMs.

We propose a distribution-free statistical framework that converts arbitrary rewrite-based detectors into detectors with finite-sample false discovery rate (FDR) guarantees, without retraining. Our key observation is that rewrite-based detection implicitly constructs knockoff samples, enabling LLM-generated text detection to be formulated as a multiple hypothesis testing problem with knockoff structure. This perspective separates the design of detection statistics from the control of false discoveries, allowing existing rewrite detectors to inherit finite-sample FDR guarantees through a simple calibration procedure. We demonstrate reliable FDR control with meaningful detection power across three detection models, 19 domains, and four LLMs.
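The calibration idea can be illustrated with a minimal sketch of a knockoff-style selection step. The abstract does not specify the paper's detection statistic or thresholding rule, so this is an assumption: it uses the standard knockoff+ threshold from the knockoff-filter literature, and it assumes each text already has a signed score (the hypothetical list `w`) comparing the original against its rewrite, where large positive values suggest a discovery and the sign is symmetric for null texts. The function names and the example scores are illustrative only.

```python
def knockoff_plus_threshold(w, q):
    """Return the smallest t > 0 whose estimated false discovery
    proportion (1 + #{w_i <= -t}) / max(1, #{w_i >= t}) is at most q.

    Returns infinity when no candidate threshold qualifies.
    ASSUMPTION: w holds signed per-text scores with symmetric null signs.
    """
    # Candidate thresholds are the distinct nonzero magnitudes of the scores.
    candidates = sorted({abs(x) for x in w if x != 0})
    for t in candidates:
        negatives = sum(1 for x in w if x <= -t)  # proxy for false discoveries
        positives = sum(1 for x in w if x >= t)   # texts that would be selected
        if (1 + negatives) / max(1, positives) <= q:
            return t
    return float("inf")


def knockoff_select(w, q):
    """Return indices of texts whose score reaches the knockoff+ threshold."""
    t = knockoff_plus_threshold(w, q)
    return [i for i, x in enumerate(w) if x >= t]


# Hypothetical scores for ten texts, target FDR level q = 0.2.
scores = [5.0, 4.0, 3.5, 3.0, 2.5, 2.0, -0.5, 0.4, -0.3, 1.5]
selected = knockoff_select(scores, q=0.2)
```

The scoring function stays separate from the thresholding step, which mirrors the separation between detection statistics and false-discovery control described above: any detector that produces suitable signed scores could be plugged into the same selection rule.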

