CL · Feb 24, 2024

SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection

arXiv:2402.15873v2 · 2 citations · h-index: 2 · SemEval
Originality: Synthesis-oriented
AI Analysis

The paper addresses the growing need for machine-generated text detection as large language models become more prevalent, but it appears incremental because it builds on existing RoBERTa techniques.

The authors detect machine-generated text by taking weighted averages of RoBERTa layers to capture the relevant information, and they submitted the resulting system to SemEval-2024 Task 8 Subtask A (monolingual) and Subtask B.

This document contains the details of the authors' submission to the proceedings of SemEval 2024's Task 8: Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, Subtask A (monolingual) and Subtask B. Detection of machine-generated text is becoming an increasingly important task with the advent of large language models (LLMs). In this paper, we lay out how using weighted averages of RoBERTa layers lets us capture information about text that is relevant to machine-generated text detection.
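The abstract does not spell out how the layer weights are defined, so the following is only a minimal sketch of one common way to average encoder layers, not the authors' implementation. It assumes one scalar score per layer, normalized with a softmax, and a mean over tokens to get a single vector per text. The names `weighted_layer_average`, `mean_pool`, and `layer_scores` are hypothetical, and plain Python lists stand in for the per-layer hidden states that a RoBERTa encoder would produce.

```python
import math
from typing import List

Matrix = List[List[float]]  # shape: tokens x hidden_size


def softmax(scores: List[float]) -> List[float]:
    """Turn unconstrained scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_layer_average(layers: List[Matrix], layer_scores: List[float]) -> Matrix:
    """Mix per-layer token representations into a single matrix.

    layers[l][t][h] is the hidden state of token t at layer l.
    layer_scores holds one score per layer (assumed here to be learned
    during training; the softmax normalization is also an assumption).
    """
    if len(layers) != len(layer_scores):
        raise ValueError("need exactly one score per layer")
    weights = softmax(layer_scores)
    num_tokens, hidden = len(layers[0]), len(layers[0][0])
    mixed = [[0.0] * hidden for _ in range(num_tokens)]
    for weight, layer in zip(weights, layers):
        for t in range(num_tokens):
            for h in range(hidden):
                mixed[t][h] += weight * layer[t][h]
    return mixed


def mean_pool(tokens: Matrix) -> List[float]:
    """Average over tokens to get one vector per text (an assumed pooling choice)."""
    n = len(tokens)
    return [sum(column) / n for column in zip(*tokens)]


# Toy input: 2 layers, 2 tokens, hidden size 2.
toy_layers = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[3.0, 4.0], [5.0, 6.0]],
]
# Equal scores give equal weights, so this is a plain average of the layers.
text_vector = mean_pool(weighted_layer_average(toy_layers, [0.0, 0.0]))
```

In a full system, the pooled vector would be passed to a classification head, and the layer scores would be trained jointly with it so the model can emphasize whichever layers carry the most useful signal for detection.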

Code Implementations: 1 repo