CL · Apr 22, 2024

SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection

Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Chenxi Whitehouse, Alham Fikri Aji

arXiv:2404.14183v121.174 citations · h-index: 47 · SemEval

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of identifying AI-generated content, with applications in security and content moderation. It is incremental: it builds on existing detection tasks by adding new subtasks and data.

The paper tackles the problem of detecting machine-generated text across multiple domains, models, and languages. The task attracted up to 126 participants (subtask A, monolingual track), and the best systems for all subtasks used large language models (LLMs).

We present the results and the main findings of SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection. The task featured three subtasks. Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine. This subtask has two tracks: a monolingual track focused solely on English texts and a multilingual track. Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM. Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine. The task attracted a large number of participants: subtask A monolingual (126), subtask A multilingual (59), subtask B (70), and subtask C (30). In this paper, we present the task, analyze the results, and discuss the system submissions and the methods they used. For all subtasks, the best systems used LLMs.
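The three subtask formulations in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and chosen for illustration only: the class names, the 0/1 label encoding, the token-index representation of the change point, and the use of accuracy and mean absolute error as scoring functions. The abstract does not specify the data format or the official metrics, so this should be read as one plausible framing, not the organizers' implementation.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class BinaryExample:
    """Subtask A: human vs. machine (label encoding is an assumption)."""
    text: str
    label: int  # assumed: 0 = human-written, 1 = machine-generated


@dataclass
class SourceExample:
    """Subtask B: which source produced the text (names are placeholders)."""
    text: str
    source: str  # assumed: "human" or the name of a specific generator


@dataclass
class BoundaryExample:
    """Subtask C: position where authorship changes from human to machine."""
    tokens: List[str]
    boundary: int  # assumed: index of the first machine-generated token


def split_at_boundary(example: BoundaryExample) -> Tuple[List[str], List[str]]:
    """Return (human_tokens, machine_tokens) for a subtask C example."""
    return example.tokens[:example.boundary], example.tokens[example.boundary:]


def accuracy(gold: Sequence, pred: Sequence) -> float:
    """Fraction of exact matches; a natural fit for subtasks A and B."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def mean_absolute_error(gold: Sequence[int], pred: Sequence[int]) -> float:
    """Average distance between gold and predicted boundaries (subtask C)."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and the same length")
    return sum(abs(g - p) for g, p in zip(gold, pred)) / len(gold)
```

Under this framing, subtasks A and B differ only in the size of the label set, while subtask C predicts a position rather than a class, which is why a distance-based score is used for it in the sketch.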
