CL AI LG · Oct 3, 2023

On the definition of toxicity in NLP

arXiv:2310.02357v3 · 2 citations · h-index: 27
AI Analysis

This work addresses the reliance on subjective and vague data in toxicity detection, which matters to NLP researchers and practitioners because a clearer definition could improve model robustness and accuracy.

The paper tackles the problem of ill-defined toxicity in NLP by proposing a new stress-level-based definition designed to be objective and context-aware, and describes its application to dataset creation and model training.

The fundamental problem in the toxicity detection task is that toxicity is ill-defined. As a result, models are trained on subjective and vague data, which yields results that are neither robust nor accurate: garbage in, garbage out. This work proposes a new, stress-level-based definition of toxicity designed to be objective and context-aware. Alongside it, we describe possible ways of applying this new definition to dataset creation and model training.
