MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection
This work addresses the challenge of distinguishing human-written from machine-generated text. The contribution is incremental, as it applies existing models to a new competition task.
The paper tackles the detection of machine-generated text across multiple tracks of SemEval-2024 Task 8, achieving competitive results with ensembles of transformer models and other methods such as FLAN-T5 fine-tuning.
This paper presents the MasonTigers entry to SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. The task comprises Binary Human-Written vs. Machine-Generated Text Classification (Track A), Multi-Way Machine-Generated Text Classification (Track B), and Human-Machine Mixed Text Detection (Track C). Our best-performing approaches mainly use ensembles of discriminator transformer models, along with sentence transformers and statistical machine learning approaches in specific cases. In addition, zero-shot prompting and fine-tuning of FLAN-T5 are used for Tracks A and B.
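As an illustration of the ensembling idea, the following minimal sketch combines per-model class probabilities by unweighted averaging (soft voting). The abstract does not specify the combination rule, so soft voting, the function name `soft_vote`, and the probability values shown are assumptions made for illustration only, not the system's actual configuration.

```python
from typing import List, Sequence


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities across models; return the argmax label per example.

    prob_sets[m][i][c] is model m's probability for class c on example i.
    Unweighted averaging is an assumption; the source does not state the rule.
    """
    if not prob_sets:
        raise ValueError("need predictions from at least one model")
    n_models = len(prob_sets)
    n_examples = len(prob_sets[0])
    labels = []
    for i in range(n_examples):
        n_classes = len(prob_sets[0][i])
        # Mean probability of each class over all models for example i.
        avg = [
            sum(model[i][c] for model in prob_sets) / n_models
            for c in range(n_classes)
        ]
        labels.append(max(range(n_classes), key=avg.__getitem__))
    return labels


# Hypothetical outputs of three classifiers on two texts
# (class 0 = human-written, class 1 = machine-generated).
model_a = [[0.9, 0.1], [0.4, 0.6]]
model_b = [[0.6, 0.4], [0.3, 0.7]]
model_c = [[0.2, 0.8], [0.45, 0.55]]

predictions = soft_vote([model_a, model_b, model_c])
```

In this toy input the third model disagrees with the other two on the first text, and the averaged probabilities still favor the human-written class. The same function applies unchanged to a multi-way setting such as Track B, since it places no limit on the number of classes.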