CL · Jun 18, 2024

Applying Ensemble Methods to Model-Agnostic Machine-Generated Text Detection

arXiv:2406.12570v1 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for model-agnostic detection of machine-generated text. The advance is incremental because it builds on existing DetectGPT methods.

The paper addresses the detection of machine-generated text when the source large language model is unknown. It applies ensemble methods to the outputs of DetectGPT classifiers, reaching an AUROC of 0.73 with zero-shot summary statistics (relative to 0.61) and 0.94 with supervised learning, which requires a training dataset.

In this paper, we study the problem of detecting machine-generated text when the large language model (LLM) it is possibly derived from is unknown. We do so by applying ensembling methods to the outputs from DetectGPT classifiers (Mitchell et al. 2023), a zero-shot model for machine-generated text detection which is highly accurate when the generative (or base) language model is the same as the discriminative (or scoring) language model. We find that simple summary statistics of DetectGPT sub-model outputs yield an AUROC of 0.73 (relative to 0.61) while retaining its zero-shot nature, and that supervised learning methods sharply boost the accuracy to an AUROC of 0.94 but require a training dataset. This suggests the possibility of further generalisation to create a highly accurate, model-agnostic machine-generated text detector.
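The zero-shot ensembling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which summary statistics were used, so `max` and `mean` here are assumptions, as are the helper names `auroc` and `ensemble` and the convention that a higher score means "more likely machine-generated". The toy scores are invented for illustration and are not results from the paper.

```python
from statistics import mean


def auroc(scores, labels):
    """AUROC as the probability that a randomly chosen machine-generated
    text (label 1) scores higher than a randomly chosen human text
    (label 0); ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def ensemble(score_rows, statistic=mean):
    """Collapse each text's per-scorer outputs into one score using a
    summary statistic; no training data is needed, so this stays zero-shot."""
    return [statistic(row) for row in score_rows]


# Hypothetical toy data: one row per text, one column per scoring model
# (A, B, C). Each machine-generated text is flagged strongly only by the
# scorer that matches its (unknown) source model.
rows = [
    [0.9, 0.2, 0.1],  # machine-generated, source resembles scorer A
    [0.1, 0.8, 0.2],  # machine-generated, source resembles scorer B
    [0.2, 0.1, 0.7],  # machine-generated, source resembles scorer C
    [0.3, 0.3, 0.2],  # human-written
    [0.2, 0.4, 0.3],  # human-written
    [0.4, 0.2, 0.3],  # human-written
]
labels = [1, 1, 1, 0, 0, 0]

single_a = auroc([row[0] for row in rows], labels)  # scorer A alone
ens_max = auroc(ensemble(rows, max), labels)        # max over scorers
ens_mean = auroc(ensemble(rows, mean), labels)      # mean over scorers

print(f"scorer A alone: {single_a:.2f}")
print(f"max ensemble:   {ens_max:.2f}")
print(f"mean ensemble:  {ens_mean:.2f}")
```

In this toy setting a single scorer separates the classes poorly because it only reacts to text from its own source model, while a summary statistic across scorers picks up whichever sub-model responds. The supervised variant mentioned in the abstract would instead fit a classifier on the per-scorer columns, at the cost of needing labelled training data.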

