CEAID: Benchmark of Multilingual Machine-Generated Text Detection Methods for Central European Languages
This work addresses a gap in multilingual AI safety for Central European language communities, though the contribution is incremental, as it extends existing methods to new data.
The paper tackles the lack of machine-generated text detection benchmarks for Central European languages by creating the first such benchmark. It finds that supervised detectors fine-tuned on these languages perform best and are the most resistant to obfuscation.
Research on machine-generated text detection, an important task, focuses predominantly on English. This leaves existing detectors almost unusable for non-English languages, where they must rely purely on cross-lingual transferability. Only a few works focus on any of the Central European languages, leaving transferability to these languages largely unexplored. We fill this gap by providing the first benchmark of detection methods focused on this region, along with a comparison of training-language combinations to identify the best-performing ones. We focus on multi-domain, multi-generator, and multilingual evaluation, pinpointing the differences across these individual aspects, as well as the adversarial robustness of detection methods. Supervised fine-tuned detectors in the Central European languages are found to be the most performant in these languages as well as the most resistant to obfuscation.