CL · AI · May 17, 2024

DeepPavlov at SemEval-2024 Task 8: Leveraging Transfer Learning for Detecting Boundaries of Machine-Generated Texts

arXiv:2405.10629v1 · 28 citations · h-index: 6 · SemEval
Originality: Incremental advance
AI Analysis

This work addresses the detection of boundaries between human-written and machine-generated text, with applications in collaborative human-AI writing. It appears incremental because it builds on existing transfer learning methods.

The paper tackled the problem of detecting boundaries between human-written and machine-generated texts in hybrid human-AI writing, presenting a pipeline for data augmentation and fine-tuning DeBERTaV3 that achieved a new best MAE score on the SemEval-2024 competition leaderboard.

The Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection shared task in the SemEval-2024 competition aims to tackle the misuse of collaborative human-AI writing. Although many detectors of AI content exist, they are often designed to give a binary answer and thus may not suit the more nuanced problem of finding the boundaries between human-written and machine-generated texts, even as hybrid human-AI writing becomes increasingly popular. In this paper, we address the boundary detection problem. In particular, we present a pipeline for augmenting data for supervised fine-tuning of DeBERTaV3. With this pipeline, we achieve a new best MAE score according to the competition leaderboard.
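To make the MAE score mentioned in the abstract concrete, here is a minimal sketch of how a boundary-detection error could be computed. It assumes the boundary is expressed as the index of the first machine-generated word and that per-word labels use 0 for human-written and 1 for machine-generated; the function names `boundary_from_labels` and `boundary_mae` are hypothetical and do not come from the paper or its repository.

```python
from typing import Sequence


def boundary_from_labels(labels: Sequence[int]) -> int:
    """Return the index of the first word labeled machine-generated (1).

    Assumption: labels are per-word, 0 = human-written, 1 = machine-generated.
    If no word is labeled 1, the boundary is placed at the end of the text.
    """
    for index, label in enumerate(labels):
        if label == 1:
            return index
    return len(labels)


def boundary_mae(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Mean absolute error between predicted and gold boundary indices."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("at least one example is required")
    total_error = sum(abs(p - g) for p, g in zip(predicted, gold))
    return total_error / len(gold)


# Two hypothetical texts: per-word predictions converted to boundary indices.
predicted_boundaries = [
    boundary_from_labels([0, 0, 1, 1]),     # boundary at word 2
    boundary_from_labels([0, 0, 0, 0, 1]),  # boundary at word 4
]
gold_boundaries = [3, 4]
score = boundary_mae(predicted_boundaries, gold_boundaries)
```

A lower value means the predicted switch points lie closer to the true ones, which is why a binary human-versus-machine detector cannot be scored this way without first producing a position.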

Code Implementations: 1 repo