CL · Mar 9, 2021

Detecting Inappropriate Messages on Sensitive Topics that Could Harm a Company's Reputation

arXiv:2103.05345v1 · 801 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific problem: helping companies manage reputation risk in user-generated content. It is incremental, however, because it builds on existing toxicity detection work.

The paper tackles the detection of inappropriate messages on sensitive topics that could harm a company's reputation. It defines a fine-grained notion of inappropriateness, distinct from toxicity, and releases datasets and pre-trained models for Russian.

Not all topics are equally "flammable" in terms of toxicity: a calm discussion of turtles or fishing less often fuels inappropriate toxic dialogues than a discussion of politics or sexual minorities. We define a set of sensitive topics that can yield inappropriate and toxic messages and describe the methodology of collecting and labeling a dataset for appropriateness. While toxicity in user-generated data is well-studied, we aim to define a more fine-grained notion of inappropriateness. The core of inappropriateness is that it can harm the reputation of a speaker. This is different from toxicity in two respects: (i) inappropriateness is topic-related, and (ii) an inappropriate message is not toxic but still unacceptable. We collect and release two datasets for Russian: a topic-labeled dataset and an appropriateness-labeled dataset. We also release pre-trained classification models trained on this data.
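
The two-part definition above (topic-related, and independent of toxicity) can be illustrated with a minimal sketch. It assumes that a topic classifier and an appropriateness classifier have already produced scores between 0 and 1 for a message. The function names, the score format, and the 0.5 thresholds are hypothetical choices for illustration, not the interface of the released models.

```python
from typing import Dict, List, Tuple


def sensitive_topics(topic_scores: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Return the sensitive topics whose (assumed) classifier score reaches the threshold."""
    return sorted(topic for topic, score in topic_scores.items() if score >= threshold)


def is_inappropriate(
    topic_scores: Dict[str, float],
    inappropriateness_score: float,
    topic_threshold: float = 0.5,            # hypothetical cut-off
    inappropriateness_threshold: float = 0.5,  # hypothetical cut-off
) -> Tuple[bool, List[str]]:
    """Flag a message only if it touches a sensitive topic AND is scored as inappropriate.

    No toxicity score is consulted: under the definition above, a message
    can be unacceptable without being toxic.
    """
    topics = sensitive_topics(topic_scores, topic_threshold)
    flagged = bool(topics) and inappropriateness_score >= inappropriateness_threshold
    return flagged, topics
```

For example, `is_inappropriate({"politics": 0.9, "religion": 0.1}, 0.8)` flags the message and reports `["politics"]`, while the same inappropriateness score with no topic above the threshold is not flagged.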

Code Implementations: 2 repos
