CL AI LG Jun 13, 2023

Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level

Mujahid Ali Quidwai, Chunhui Li, Parijat Dube

arXiv:2306.08122v14.325 citations h-index: 94

Originality: Incremental advance

AI Analysis

This work addresses plagiarism detection for educators and institutions, offering a more transparent and adaptable solution. It is an incremental advance, however, because it builds on existing AI text classification methods.

The paper tackles the problem of detecting AI-generated plagiarism in academic writing by proposing a novel NLP-based method that achieves up to 94% accuracy in classifying human and AI text, providing quantifiable metrics at sentence and document levels for easier human interpretation.

The increasing reliance on large language models (LLMs) in academic writing has led to a rise in plagiarism. Existing AI-generated text classifiers have limited accuracy and often produce false positives. We propose a novel approach using natural language processing (NLP) techniques, offering quantifiable metrics at both sentence and document levels for easier interpretation by human evaluators. Our method employs a multi-faceted approach, generating multiple paraphrased versions of a given question and inputting them into the LLM to generate answers. By using a contrastive loss function based on cosine similarity, we match generated sentences with those from the student's response. Our approach achieves up to 94% accuracy in classifying human and AI text, providing a robust and adaptable solution for plagiarism detection in academic settings. This method improves with LLM advancements, reducing the need for new model training or reconfiguration, and offers a more transparent way of evaluating and detecting AI-generated text.
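The matching step described in the abstract can be illustrated with a minimal sketch. It assumes the LLM answers to the paraphrased questions have already been collected as plain strings. It also swaps in a simple bag-of-words cosine similarity for the learned sentence representations and contrastive loss that the paper uses, so it shows only the shape of the sentence-level and document-level metrics, not the authors' implementation. The function names and the 0.8 threshold are hypothetical choices made for illustration.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def bag_of_words(sentence):
    """Lowercased token counts; a stand-in for a learned sentence embedding."""
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sentence_scores(student_text, generated_answers):
    """Score each student sentence by its best match among all generated sentences."""
    pool = [
        bag_of_words(s)
        for answer in generated_answers
        for s in split_sentences(answer)
    ]
    scores = []
    for sentence in split_sentences(student_text):
        vec = bag_of_words(sentence)
        best = max((cosine(vec, g) for g in pool), default=0.0)
        scores.append((sentence, best))
    return scores


def document_score(scores, threshold=0.8):
    """Fraction of sentences whose best match meets the (assumed) threshold."""
    if not scores:
        return 0.0
    flagged = sum(1 for _, s in scores if s >= threshold)
    return flagged / len(scores)


if __name__ == "__main__":
    # Hypothetical inputs: one LLM answer and one student response.
    generated = [
        "Photosynthesis converts light energy into chemical energy. "
        "It takes place in the chloroplasts."
    ]
    student = (
        "Photosynthesis converts light energy into chemical energy. "
        "I watched my tomato plants grow all summer."
    )
    per_sentence = sentence_scores(student, generated)
    for sentence, score in per_sentence:
        print(f"{score:.2f}  {sentence}")
    print("document score:", document_score(per_sentence))
```

Reporting a per-sentence score alongside the document-level fraction mirrors the abstract's goal of giving human evaluators interpretable evidence rather than a single opaque label. A real system would replace `bag_of_words` with a proper sentence encoder.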
