CL | Nov 17, 2024

Dialectal Toxicity Detection: Evaluating LLM-as-a-Judge Consistency Across Language Varieties

arXiv:2411.10954v1 | 6 citations | h-index: 13 | Has Code | EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses bias and inconsistency in AI toxicity detection for speakers of diverse language varieties. The advance is incremental because it builds on existing LLM-as-a-judge methods.

The paper examines how dialectal differences affect toxicity detection by large language models (LLMs). Evaluating across 10 language clusters and 60 varieties, it finds that LLMs are sensitive to both multilingual and dialectal variation, and that consistency is weakest for LLM-human agreement, followed by dialectal consistency.

There has been little systematic study of how dialectal differences affect toxicity detection by modern LLMs. Furthermore, although using LLMs as evaluators ("LLM-as-a-judge") is a growing research area, their sensitivity to dialectal nuances is still underexplored and requires more focused attention. In this paper, we address these gaps through a comprehensive toxicity evaluation of LLMs across diverse dialects. We create a multi-dialect dataset through synthetic transformations and human-assisted translations, covering 10 language clusters and 60 varieties. We then evaluate three LLMs on their ability to assess toxicity along three axes: multilingual, dialectal, and LLM-human consistency. Our findings show that LLMs are sensitive to both multilingual and dialectal variations. However, when the three kinds of consistency are ranked, the weakest is LLM-human agreement, followed by dialectal consistency. Code repository: https://github.com/ffaisal93/dialect_toxicity_llm_judge
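To make the idea of "consistency" concrete, here is a minimal sketch of one way paired toxicity ratings could be compared. It is not the authors' metric, which this page does not specify. The function name `agreement_rate`, the 1-5 rating scale, and all ratings below are hypothetical and chosen only for illustration.

```python
from statistics import mean


def agreement_rate(ratings_a, ratings_b, tolerance=0):
    """Fraction of paired ratings whose absolute difference is within tolerance.

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    if len(ratings_a) != len(ratings_b):
        raise ValueError("rating lists must have the same length")
    if not ratings_a:
        raise ValueError("rating lists must not be empty")
    return mean(
        1.0 if abs(a - b) <= tolerance else 0.0
        for a, b in zip(ratings_a, ratings_b)
    )


# Made-up toxicity ratings (assumed 1-5 scale) for the same four sentences.
llm_standard = [1, 4, 5, 2]    # LLM judge, standard variety
llm_dialect = [1, 3, 5, 4]     # LLM judge, dialectal rewrite of the same sentences
human_standard = [2, 4, 5, 1]  # human annotators, standard variety

# Dialectal consistency: does the LLM rate a sentence the same in both varieties?
dialectal_consistency = agreement_rate(llm_standard, llm_dialect)

# LLM-human consistency: does the LLM agree with human annotators?
llm_human_consistency = agreement_rate(llm_standard, human_standard)
```

Exact-match agreement is the strictest reading. Passing `tolerance=1` treats ratings that differ by one point as consistent, which may be a better fit when the scale is fine-grained.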

Code Implementations: 1 repo
