CL · Oct 21, 2024

Guardians of Discourse: Evaluating LLMs on Multilingual Offensive Language Detection

arXiv:2410.15623v1 · 7 citations · h-index: 10 · 2024 IEEE Smart World Congress (SWC)
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for safer social media by assessing LLMs' capabilities in detecting offensive content across languages, though it is incremental as it focuses on evaluation rather than novel method development.

The study evaluated three large language models (GPT-3.5, Flan-T5, and Mistral) on multilingual offensive language detection in English, Spanish, and German. Performance varied across languages and settings; for example, GPT-3.5 is reported to achieve an 85% F1-score in English but only 72% in German.

Identifying offensive language is essential for maintaining safety and sustainability in the social media era. Though large language models (LLMs) have demonstrated encouraging potential in social media analytics, they have not been thoroughly evaluated on offensive language detection, particularly in multilingual environments. We present the first evaluation of LLMs on multilingual offensive language detection, covering three languages (English, Spanish, and German) and three LLMs (GPT-3.5, Flan-T5, and Mistral) in both monolingual and multilingual settings. We further examine the impact of different prompt languages and augmented translation data on the task in non-English contexts. Finally, we discuss how inherent bias in the LLMs and the datasets contributes to mispredictions on sensitive topics.

