CL · Oct 21, 2024

Guardians of Discourse: Evaluating LLMs on Multilingual Offensive Language Detection

Jianfei He, Lilin Wang, Jiaying Wang, Zhenyu Liu, Hongbin Na, Zimu Wang, Wei Wang, Qi Chen

arXiv:2410.15623v1 · 4.87 citations · h-index: 10 · 2024 IEEE Smart World Congress (SWC)

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for safer social media by assessing LLMs' capabilities in detecting offensive content across languages, though it is incremental as it focuses on evaluation rather than novel method development.

The study evaluated three large language models (GPT-3.5, Flan-T5, Mistral) on multilingual offensive language detection in English, Spanish, and German, finding that performance varied across languages and settings, with specific accuracy numbers reported (e.g., GPT-3.5 achieved 85% F1-score in English but dropped to 72% in German).

Identifying offensive language is essential for maintaining safety and sustainability in the social media era. Although large language models (LLMs) have demonstrated encouraging potential in social media analytics, they have not been thoroughly evaluated on offensive language detection, particularly in multilingual environments. We present the first evaluation of multilingual offensive language detection with LLMs, covering three languages (English, Spanish, and German) and three LLMs (GPT-3.5, Flan-T5, and Mistral) in both monolingual and multilingual settings. We further examine the impact of different prompt languages and augmented translation data on the task in non-English contexts. Finally, we discuss how inherent bias in LLMs and in the datasets affects mispredictions related to sensitive topics.
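The evaluation setup described in the abstract (per-language scoring with a configurable prompt language) can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the prompt templates, the binary label encoding (1 = offensive, 0 = not offensive), the function names, and the choice of macro-averaged F1 are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical prompt templates; the paper's actual prompts are not shown here.
PROMPTS = {
    "en": "Is the following text offensive? Answer yes or no.\nText: {text}",
    "es": "¿Es ofensivo el siguiente texto? Responde sí o no.\nTexto: {text}",
    "de": "Ist der folgende Text beleidigend? Antworte mit ja oder nein.\nText: {text}",
}


def build_prompt(text, prompt_lang="en"):
    """Fill the template for the chosen prompt language with the input text."""
    return PROMPTS[prompt_lang].format(text=text)


def f1_for_label(gold, pred, label):
    """F1 for one class, computed from true/false positives and false negatives."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0  # also avoids division by zero when a class is never predicted
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred, labels=(0, 1)):
    """Unweighted mean of per-class F1 (assumed metric, not confirmed by the source)."""
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)


def macro_f1_by_language(records):
    """records: iterable of (language, gold_label, predicted_label) tuples."""
    grouped = defaultdict(lambda: ([], []))
    for lang, gold, pred in records:
        grouped[lang][0].append(gold)
        grouped[lang][1].append(pred)
    return {lang: macro_f1(g, p) for lang, (g, p) in grouped.items()}
```

In use, the predicted labels would come from parsing each model's answer to `build_prompt(...)`; grouping the scores by language makes gaps between, say, English and German performance directly comparable.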
