CL AI IR Nov 17, 2024

SEFD: Semantic-Enhanced Framework for Detecting LLM-Generated Text

Weiqing He, Bojian Hou, Tianqi Shang, Davoud Ataee Tarzanagh, Qi Long, Li Shen

arXiv:2411.12764v1 | 4.25 citations | h-index: 6 | Has Code | BigData

Originality: Incremental advance

AI Analysis

This work addresses the need for robust tools to detect LLM-generated text in real-world applications such as online forums, though it appears incremental because it builds on existing detection methods.

The paper tackles the problem of detecting LLM-generated text that evades existing methods through paraphrasing, presenting SEFD, a semantic-enhanced framework that improves detection accuracy in paraphrasing scenarios while maintaining robustness for standard content.

The widespread adoption of large language models (LLMs) has created an urgent need for robust tools to detect LLM-generated text, especially in light of paraphrasing techniques that often evade existing detection methods. To address this challenge, we present a novel semantic-enhanced framework for detecting LLM-generated text (SEFD) that leverages a retrieval-based mechanism to fully utilize text semantics. Our framework improves upon existing detection methods by systematically integrating retrieval-based techniques with traditional detectors, employing a carefully curated retrieval mechanism that strikes a balance between comprehensive coverage and computational efficiency. We showcase the effectiveness of our approach in sequential text scenarios common in real-world applications, such as online forums and Q&A platforms. Through comprehensive experiments across various LLM-generated texts and detection methods, we demonstrate that our framework substantially enhances detection accuracy in paraphrasing scenarios while maintaining robustness for standard LLM-generated content.
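
The abstract describes combining a retrieval mechanism with a traditional detector but gives no implementation details. The following is a minimal sketch of that general idea under stated assumptions, not the authors' method. The class name `RetrievalAugmentedDetector`, the bag-of-words similarity, the max-based score fusion, the bounded pool, and the 0.5 threshold are all hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter, deque


def bow(text):
    """Lowercased bag-of-words counts; a crude stand-in for a semantic embedding."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class RetrievalAugmentedDetector:
    """Hypothetical wrapper: fuses a base detector score with a retrieval score."""

    def __init__(self, base_detector, pool_size=1000, threshold=0.5):
        self.base_detector = base_detector    # callable: text -> score in [0, 1]
        self.pool = deque(maxlen=pool_size)   # bounded pool (assumed policy)
        self.threshold = threshold            # assumed decision threshold

    def retrieval_score(self, text):
        """Highest similarity between the text and any pooled (flagged) text."""
        vec = bow(text)
        return max((cosine(vec, p) for p in self.pool), default=0.0)

    def score(self, text):
        """Assumed fusion rule: take the larger of the two scores."""
        return max(self.base_detector(text), self.retrieval_score(text))

    def process(self, text):
        """Score one text in a sequential stream; pool it if it is flagged."""
        s = self.score(text)
        flagged = s >= self.threshold
        if flagged:
            self.pool.append(bow(text))
        return s, flagged


# Toy base detector (an assumption): fires only on one telltale word.
detector = RetrievalAugmentedDetector(lambda t: 0.9 if "delve" in t.lower() else 0.1)
for post in [
    "Let us delve into the benefits of regular exercise for heart health",
    "The benefits of regular exercise for heart health are considerable",
]:
    print(detector.process(post))
```

In this sketch the second post evades the toy base detector but is still flagged because it closely resembles a text already in the pool. Bag-of-words overlap only catches paraphrases that reuse vocabulary; a real system would presumably need sentence embeddings to capture the semantic similarity the abstract refers to.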

