CL · Nov 26, 2023

Machine-Generated Text Detection using Deep Learning

arXiv:2311.15425v1 · 3 citations · h-index: 2
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of detecting text authenticity for applications such as content moderation, but it is incremental, building on existing detection methods.

The research tackles the problem of distinguishing machine-generated text from human-written text using deep learning. It reports effective detection across multiple datasets, including Twitter Sentiment and SQuAD, with results depending heavily on sentence sequence length.

Our research focuses on the challenge of distinguishing text produced by Large Language Models (LLMs) from human-written text, a capability that matters for a range of applications. Amid ongoing debate about whether such a detector is attainable, we present evidence supporting its feasibility. We evaluated our models on multiple datasets, including Twitter Sentiment, Football Commentary, Project Gutenberg, PubMedQA, and SQuAD, confirming the efficacy of the enhanced detection approaches. These datasets were sampled under detailed constraints intended to cover a broad range of cases, laying the foundation for future research. We evaluate text generated by GPT-3.5-Turbo against several detectors, including SVM, RoBERTa-base, and RoBERTa-large. Our findings indicate that the results depended predominantly on the sequence length of the sentences.
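The reported dependence on sequence length can be checked by breaking a detector's accuracy down by input length. The following is a minimal sketch, not the authors' evaluation code: the function name `accuracy_by_length`, the bucket edges, and whitespace tokenization are all assumptions made for illustration, and the records would come from whichever detector (SVM, RoBERTa, or other) is being evaluated.

```python
from bisect import bisect_right
from collections import defaultdict


def accuracy_by_length(records, edges=(10, 50)):
    """Return detector accuracy per sequence-length bucket.

    records: iterable of (text, true_label, predicted_label) tuples.
    edges:   hypothetical, ascending cut points in tokens. With (10, 50)
             the buckets are "<10", "10-49", and ">=50".
    Length is counted in whitespace-separated tokens (an assumption; a
    real evaluation would likely use the detector's own tokenizer).
    """
    # Build human-readable bucket names from the cut points.
    names = (
        [f"<{edges[0]}"]
        + [f"{lo}-{hi - 1}" for lo, hi in zip(edges, edges[1:])]
        + [f">={edges[-1]}"]
    )
    hits = defaultdict(int)
    totals = defaultdict(int)
    for text, true_label, predicted_label in records:
        n_tokens = len(text.split())
        # bisect_right gives the index of the bucket this length falls in.
        bucket = names[bisect_right(edges, n_tokens)]
        totals[bucket] += 1
        hits[bucket] += int(true_label == predicted_label)
    # Only buckets that received at least one record are reported.
    return {name: hits[name] / totals[name] for name in names if totals[name]}
```

Calling the function with a list of labelled predictions returns a dictionary mapping each non-empty bucket to its accuracy, which makes it easy to see whether short inputs are harder to classify than long ones.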

Code Implementations: 1 repo