CL · Oct 14, 2024

Assessing the Human Likeness of AI-Generated Counterspeech

arXiv:2410.11007v2 · 20 citations · h-index: 4 · COLING
Originality: Synthesis-oriented
AI Analysis

This paper addresses the effectiveness of counterspeech for online moderation. Its contribution is incremental: it builds on existing generation strategies and adds a focus on human likeness.

The paper assesses how human-like AI-generated counterspeech is. It finds that both simple classifiers and humans can easily distinguish AI-generated from human-written counterspeech, and that the two differ in linguistic characteristics, politeness, and specificity.

Counterspeech is a targeted response to counteract and challenge abusive or hateful content. It effectively curbs the spread of hatred and fosters constructive online communication. Previous studies have proposed different strategies for automatically generated counterspeech. Evaluations, however, focus on relevance, surface form, and other shallow linguistic characteristics. This paper investigates the human likeness of AI-generated counterspeech, a critical factor influencing effectiveness. We implement and evaluate several LLM-based generation strategies, and discover that AI-generated and human-written counterspeech can be easily distinguished by both simple classifiers and humans. Further, we reveal differences in linguistic characteristics, politeness, and specificity. The dataset used in this study is publicly available for further research.
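To make the "simple classifiers" finding concrete, here is a minimal sketch of the kind of bag-of-words classifier that could separate the two sources. It is an illustration only: the abstract does not say which classifiers the authors used, so the multinomial naive Bayes model, the function names (tokenize, train_naive_bayes, predict), and the four toy sentences are all assumptions made up for this example, not material from the paper or its dataset.

```python
import math
from collections import Counter

# Invented toy examples, NOT taken from the paper's dataset.
TOY_DATA = [
    ("i understand your concern, however it is important to treat everyone with respect", "ai"),
    ("it is important to remember that generalizations can be harmful", "ai"),
    ("nah that is just wrong, my neighbours are lovely people", "human"),
    ("come on, you do not even know them", "human"),
]


def tokenize(text):
    """Lowercase and split on whitespace (punctuation stays attached)."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Count words per label for a multinomial naive Bayes model."""
    word_counts = {}        # label -> Counter of token frequencies
    doc_counts = Counter()  # label -> number of training texts
    vocab = set()
    for text, label in examples:
        doc_counts[label] += 1
        counts = word_counts.setdefault(label, Counter())
        for tok in tokenize(text):
            counts[tok] += 1
            vocab.add(tok)
    return {"word_counts": word_counts, "doc_counts": doc_counts, "vocab": vocab}


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    total_docs = sum(model["doc_counts"].values())
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_docs in model["doc_counts"].items():
        counts = model["word_counts"][label]
        total_words = sum(counts.values())
        score = math.log(n_docs / total_docs)  # log prior
        for tok in tokenize(text):
            if tok in model["vocab"]:  # skip words never seen in training
                # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
                score += math.log((counts[tok] + 1) / (total_words + vocab_size))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_naive_bayes(TOY_DATA)
```

Calling predict(model, "some reply text") returns either "ai" or "human". With only four training sentences the output says nothing about real-world separability; the sketch only shows how surface word choice alone can drive such a classifier.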

Code Implementations: 1 repo
