CL · May 5, 2025

A Comparative Benchmark of a Moroccan Darija Toxicity Detection Model (Typica.ai) and Major LLM-Based Moderation APIs (OpenAI, Mistral, Anthropic)

arXiv:2505.04640v1 · 1 citation · h-index: 1
Originality: Synthesis-oriented
AI Analysis

This paper addresses reliable content moderation in underrepresented languages such as Moroccan Darija. Its contribution is incremental: it benchmarks existing approaches rather than introducing new methods.

The paper compares a custom Moroccan Darija toxicity detection model (Typica.ai) with major LLM-based moderation APIs (OpenAI, Mistral, Anthropic) on culturally grounded toxic content. It reports precision, recall, F1-score, and accuracy, and finds that Typica.ai performs best.

This paper presents a comparative benchmark evaluating the performance of Typica.ai's custom Moroccan Darija toxicity detection model against major LLM-based moderation APIs: OpenAI (omni-moderation-latest), Mistral (mistral-moderation-latest), and Anthropic Claude (claude-3-haiku-20240307). We focus on culturally grounded toxic content, including implicit insults, sarcasm, and culturally specific aggression often overlooked by general-purpose systems. Using a balanced test set derived from the OMCD_Typica.ai_Mix dataset, we report precision, recall, F1-score, and accuracy, offering insights into challenges and opportunities for moderation in underrepresented languages. Our results highlight Typica.ai's superior performance, underlining the importance of culturally adapted models for reliable content moderation.
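For readers unfamiliar with the four reported metrics, the following is a minimal sketch of how they can be computed for a binary toxic / non-toxic task. It is not the authors' evaluation code. The function name `binary_metrics`, the `BinaryMetrics` container, and the toy labels are all hypothetical and chosen only for illustration; none of the numbers come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    precision: float
    recall: float
    f1: float
    accuracy: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> BinaryMetrics:
    """Compute precision, recall, F1, and accuracy for one positive class.

    Hypothetical helper: `positive` marks the "toxic" label (assumed to be 1).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard every ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return BinaryMetrics(precision, recall, f1, accuracy)


# Toy, made-up labels on a balanced set (4 toxic, 4 non-toxic).
toy_true = [1, 1, 1, 1, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 1, 0, 0, 0]
toy_result = binary_metrics(toy_true, toy_pred)
print(toy_result)
```

On a balanced test set, accuracy is not dominated by a majority class, so it can be read alongside F1 without the usual class-imbalance caveat.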

Code Implementations: 1 repo
