SEFD: Semantic-Enhanced Framework for Detecting LLM-Generated Text
SEFD addresses the need for robust tools to detect LLM-generated text in real-world applications such as online forums, though the contribution appears incremental because it builds on existing detection methods.
The paper tackles the problem of detecting LLM-generated text that evades existing methods through paraphrasing, presenting SEFD, a semantic-enhanced framework that improves detection accuracy in paraphrasing scenarios while maintaining robustness for standard content.
The widespread adoption of large language models (LLMs) has created an urgent need for robust tools to detect LLM-generated text, especially in light of \textit{paraphrasing} techniques that often evade existing detection methods. To address this challenge, we present a novel semantic-enhanced framework for detecting LLM-generated text (SEFD) that leverages a retrieval-based mechanism to fully utilize text semantics. Our framework improves upon existing detection methods by systematically integrating retrieval-based techniques with traditional detectors, employing a carefully curated retrieval mechanism that strikes a balance between comprehensive coverage and computational efficiency. We showcase the effectiveness of our approach in sequential text scenarios common in real-world applications, such as online forums and Q\&A platforms. Through comprehensive experiments across various LLM-generated texts and detection methods, we demonstrate that our framework substantially enhances detection accuracy in paraphrasing scenarios while maintaining robustness for standard LLM-generated content.
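To make the idea of combining retrieval with a traditional detector concrete, the following is a minimal sketch and not the paper's implementation. Everything named here is an assumption for illustration: the `RetrievalPool` class, the `fused_score` function, the bag-of-words `embed` function (a crude stand-in for a real semantic encoder such as a sentence-embedding model), and the rule of taking the maximum of the two scores, which the abstract does not specify.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy stand-in for a semantic encoder: lowercase bag-of-words counts.

    Assumption: a real system would use a learned sentence embedding instead.
    """
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class RetrievalPool:
    """Hypothetical store of texts already known to be LLM-generated."""

    def __init__(self) -> None:
        self._vectors: List[Counter] = []

    def add(self, text: str) -> None:
        """Record a known LLM-generated text for later retrieval."""
        self._vectors.append(embed(text))

    def max_similarity(self, text: str) -> float:
        """Highest similarity between `text` and any stored text (0.0 if empty)."""
        query = embed(text)
        return max((cosine(query, v) for v in self._vectors), default=0.0)


def fused_score(
    text: str,
    detector: Callable[[str], float],
    pool: RetrievalPool,
) -> float:
    """Combine a detector score in [0, 1] with the retrieval similarity.

    Assumption: the two signals are fused by taking their maximum, so a
    paraphrase that fools the detector can still be flagged by retrieval.
    """
    return max(detector(text), pool.max_similarity(text))


if __name__ == "__main__":
    pool = RetrievalPool()
    pool.add("The capital of France is Paris, a city known for its museums.")

    # Dummy detector that is always fooled, standing in for a traditional method.
    weak_detector = lambda _text: 0.1

    paraphrase = "Paris, the capital of France, is a city known for museums."
    unrelated = "I walked my dog this morning."
    print(fused_score(paraphrase, weak_detector, pool))
    print(fused_score(unrelated, weak_detector, pool))
```

In this sketch the paraphrase scores high only because it shares most of its words with a stored text. A paraphrase that changes the vocabulary would need a genuinely semantic encoder, which is the gap a retrieval mechanism built on text semantics is meant to close.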