CL | AI | Feb 18, 2024

A Multi-Aspect Framework for Counter Narrative Evaluation using Large Language Models

arXiv:2402.11676v2 | 32 citations | h-index: 7 | NAACL
Originality: Incremental advance
AI Analysis

The paper addresses the need for better evaluation methods in hate speech intervention. The advance is incremental, since it builds on prior work in counter-narrative generation.

The paper tackles the evaluation of automatically generated counter-narratives for hate speech, where existing automatic metrics align poorly with human judgment. It proposes a framework that prompts large language models to score candidates on five quality aspects. The LLM evaluators align strongly with human scores and outperform existing metrics.

Counter narratives - informed responses to hate speech contexts designed to refute hateful claims and de-escalate encounters - have emerged as an effective hate speech intervention strategy. While previous work has proposed automatic counter narrative generation methods to aid manual interventions, the evaluation of these approaches remains underdeveloped. Previous automatic metrics for counter narrative evaluation lack alignment with human judgment as they rely on superficial reference comparisons instead of incorporating key aspects of counter narrative quality as evaluation criteria. To address prior evaluation limitations, we propose a novel evaluation framework prompting LLMs to provide scores and feedback for generated counter narrative candidates using 5 defined aspects derived from guidelines from counter narrative specialized NGOs. We found that LLM evaluators achieve strong alignment to human-annotated scores and feedback and outperform alternative metrics, indicating their potential as multi-aspect, reference-free and interpretable evaluators for counter narrative evaluation.
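
As a rough illustration of the multi-aspect, reference-free setup the abstract describes, the sketch below builds one prompt per aspect and parses a score plus feedback from each reply. It is not the authors' implementation: the prompt wording, the 1-5 scale, the reply format, the function names, and the `llm` callable (any function mapping a prompt string to a reply string) are all assumptions. The aspect names and definitions are supplied by the caller, because the abstract does not list them.

```python
import re
from typing import Callable, Dict


def build_prompt(hate_speech: str, counter_narrative: str,
                 aspect: str, definition: str) -> str:
    """Build a reference-free prompt for one aspect (wording is hypothetical)."""
    return (
        f"Evaluate the counter narrative below on the aspect '{aspect}'.\n"
        f"Aspect definition: {definition}\n\n"
        f"Hate speech: {hate_speech}\n"
        f"Counter narrative: {counter_narrative}\n\n"
        "Reply with a line 'Score: <1-5>' followed by a line 'Feedback: <text>'."
    )


def parse_reply(reply: str) -> Dict[str, object]:
    """Extract the score and feedback; score is None if the reply has no score line."""
    score = re.search(r"Score:\s*([1-5])\b", reply)
    feedback = re.search(r"Feedback:\s*(.+)", reply, flags=re.DOTALL)
    return {
        "score": int(score.group(1)) if score else None,
        "feedback": feedback.group(1).strip() if feedback else "",
    }


def evaluate(hate_speech: str, counter_narrative: str,
             aspects: Dict[str, str],
             llm: Callable[[str], str]) -> Dict[str, Dict[str, object]]:
    """Score one candidate on every aspect, one LLM call per aspect."""
    return {
        name: parse_reply(llm(build_prompt(hate_speech, counter_narrative,
                                           name, definition)))
        for name, definition in aspects.items()
    }
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM client. Because no reference counter narrative appears in the prompt, the evaluation stays reference-free, and the returned feedback text is what makes each score interpretable.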

Code Implementations: 1 repo