CL | Nov 18, 2024

Advacheck at GenAI Detection Task 1: AI Detection Powered by Domain-Aware Multi-Tasking

arXiv:2411.11736v1 | 20 citations | h-index: 6 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the challenge of AI-generated text detection for competition settings, representing an incremental improvement over existing methods.

The paper tackles the problem of distinguishing machine-generated from human-written texts in the monolingual subtask of the GenAI Detection Task 1 competition. Its system took first place with an 83.07% macro F1-score on the test set, 10% above the baseline.

The paper describes a system designed by the Advacheck team to recognise machine-generated and human-written texts in the monolingual subtask of the GenAI Detection Task 1 competition. Our system is a multi-task architecture with a Transformer encoder shared between several classification heads. One head is responsible for binary classification between human-written and machine-generated texts, while the other heads are auxiliary multiclass classifiers for texts of different domains from particular datasets. Because the multiclass heads were trained to distinguish the domains present in the data, they provide a better understanding of the samples. This approach led us to first place in the official ranking, with an 83.07% macro F1-score on the test set, surpassing the baseline by 10%. We further study the resulting system through ablation, error and representation analyses, finding that multi-task learning outperforms the single-task mode and that the simultaneously trained tasks form a cluster structure in the embedding space.
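The shared-encoder, multi-head layout described in the abstract can be illustrated with a small pure-Python sketch of one forward pass and a joint loss. This is not the authors' implementation: the toy `shared_encoder` only stands in for the Transformer encoder, and the dataset names, domain counts, hidden size, and `aux_weight` are all hypothetical values chosen for illustration.

```python
import math
import random

random.seed(0)
HIDDEN = 8  # hypothetical size of the shared representation


def make_linear(in_dim, out_dim):
    """Create a small random linear layer as (weights, bias)."""
    weights = [[random.gauss(0.0, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    return weights, [0.0] * out_dim


def apply_linear(layer, x):
    """Compute W x + b for one input vector."""
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def cross_entropy(logits, target):
    """Softmax cross-entropy for a single example."""
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(z - top) for z in logits))
    return log_norm - logits[target]


def shared_encoder(tokens):
    """Toy stand-in for the shared Transformer encoder (assumption):
    a length-normalised bag of character-sum buckets."""
    vec = [0.0] * HIDDEN
    for tok in tokens:
        vec[sum(ord(c) for c in tok) % HIDDEN] += 1.0
    return [v / max(len(tokens), 1) for v in vec]


# One binary head (human vs. machine) on top of the shared encoder.
binary_head = make_linear(HIDDEN, 2)

# One auxiliary multiclass head per dataset, predicting the text's domain.
# Dataset names and domain counts are hypothetical.
domain_heads = {
    "dataset_a": make_linear(HIDDEN, 3),
    "dataset_b": make_linear(HIDDEN, 4),
}


def multitask_loss(tokens, is_machine, dataset, domain, aux_weight=0.5):
    """Joint loss: binary loss plus a weighted auxiliary domain loss.
    The weighting scheme is an assumption, not taken from the paper."""
    hidden = shared_encoder(tokens)  # both heads read the same vector
    binary_loss = cross_entropy(apply_linear(binary_head, hidden), is_machine)
    aux_loss = cross_entropy(apply_linear(domain_heads[dataset], hidden), domain)
    return binary_loss + aux_weight * aux_loss, binary_loss, aux_loss


total, binary_part, aux_part = multitask_loss(
    "the model writes fluent text".split(),
    is_machine=1, dataset="dataset_a", domain=2,
)
print(f"total={total:.4f} binary={binary_part:.4f} aux={aux_part:.4f}")
```

Because both heads consume the same encoder output, gradients from the auxiliary domain loss would also update the shared representation during training, which is the mechanism multi-task learning relies on.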

Code Implementations: 1 repo
