SPADE: Structured Prompting Augmentation for Dialogue Enhancement in Machine-Generated Text Detection
This work addresses security concerns in LLM applications by improving detection models for synthetic dialogues, though the contribution is incremental, since it builds on existing methods and mainly adds new datasets.
The paper tackles the detection of machine-generated text in dialogues, where high-quality synthetic datasets are scarce. It proposes SPADE to create 14 new dialogue datasets and shows that using a mixed dataset improves generalization performance.
The increasing capability of large language models (LLMs) to generate synthetic content has heightened concerns about their misuse, driving the development of Machine-Generated Text (MGT) detection models. However, these detectors face significant challenges due to the lack of high-quality synthetic datasets for training. To address this issue, we propose SPADE, a structured framework for detecting synthetic dialogues using prompt-based positive and negative samples. Our proposed methods yield 14 new dialogue datasets, which we benchmark against eight MGT detection models. The results demonstrate improved generalization performance when utilizing a mixed dataset produced by the proposed augmentation frameworks, offering a practical approach to enhancing LLM application security. Because real-world agents have no knowledge of an opponent's future utterances, we also simulate online dialogue detection and examine the relationship between chat history length and detection accuracy. Our open-source datasets, code, and prompts can be downloaded from https://github.com/AngieYYF/SPADE-customer-service-dialogue.
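The online detection setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`prefixes`, `accuracy_by_history_length`), the keyword-based `toy_detector`, and the two example dialogues are all hypothetical, chosen only to show how a detector might be scored on the first k utterances of each dialogue so that accuracy can be compared across chat history lengths.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Dialogue = List[str]  # utterances in chronological order


def prefixes(dialogue: Dialogue) -> List[Tuple[int, str]]:
    """Return (history_length, text) pairs built only from utterances seen so far."""
    return [(k, "\n".join(dialogue[:k])) for k in range(1, len(dialogue) + 1)]


def accuracy_by_history_length(
    dialogues: List[Dialogue],
    labels: List[int],
    detector: Callable[[str], int],
) -> Dict[int, float]:
    """Score the detector on every prefix and group accuracy by prefix length."""
    correct: Dict[int, int] = defaultdict(int)
    total: Dict[int, int] = defaultdict(int)
    for dialogue, label in zip(dialogues, labels):
        for k, text in prefixes(dialogue):
            total[k] += 1
            correct[k] += int(detector(text) == label)
    return {k: correct[k] / total[k] for k in sorted(total)}


def toy_detector(text: str) -> int:
    """Hypothetical stand-in for a real MGT detector: 1 = machine-generated."""
    return int("certainly" in text.lower())


# Made-up example data: label 1 = synthetic dialogue, label 0 = human dialogue.
dialogues = [
    ["Hi, I need help.", "Certainly! I can assist with that.", "Thanks."],
    ["hello", "yeah what's up", "my order is late"],
]
labels = [1, 0]

result = accuracy_by_history_length(dialogues, labels, toy_detector)
for k, acc in result.items():
    print(f"history length {k}: accuracy {acc:.2f}")
```

In this toy run the detector misses the synthetic dialogue when it has seen only one utterance and identifies it once the second utterance is available, which is the kind of dependence on chat history length that the online simulation is meant to measure. A trained detector would replace `toy_detector` in practice.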