CL | AI | Oct 17, 2022

Social Biases in Automatic Evaluation Metrics for NLG

arXiv:2210.08859v1 | 4 citations | h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses the fairness and reliability of automated evaluation, a practical concern for NLP practitioners, but it is incremental: it extends existing bias detection methods to evaluation metrics.

The study addresses social biases in automatic evaluation metrics for natural language generation. It quantifies bias with a method based on WEAT and SEAT and finds that some model-based metrics exhibit gender bias. Given gender-neutral references, these metrics may prefer male hypotheses, and their correlation with human judgments usually varies more significantly after gender swapping in image captioning and text summarization.
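As a rough illustration of the association test named above, the standard WEAT effect size compares how strongly two target sets X and Y associate with two attribute sets A and B in an embedding space. The sketch below is a minimal pure-Python version, not the paper's implementation: the function names, the toy 2-D vectors, and the use of the sample standard deviation are all assumptions made for illustration. To apply it to an evaluation metric, the vectors would presumably come from the metric's underlying encoder.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """s(w, A, B): mean similarity of w to A minus mean similarity of w to B."""
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def weat_effect_size(X, Y, A, B):
    """Difference of mean associations of X and Y, normalized by the
    standard deviation of associations over all targets.
    Assumption: sample standard deviation (n - 1 in the denominator)."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Toy 2-D "embeddings" (hypothetical values, for illustration only).
X = [[1.0, 0.0], [0.9, 0.1]]   # target set 1, e.g. career-related terms
Y = [[0.0, 1.0], [0.1, 0.9]]   # target set 2, e.g. family-related terms
A = [[1.0, 0.0]]               # attribute set 1, e.g. male terms
B = [[0.0, 1.0]]               # attribute set 2, e.g. female terms

d = weat_effect_size(X, Y, A, B)
print(f"effect size: {d:.3f}")
```

A positive value indicates that X is more strongly associated with A (and Y with B); swapping X and Y flips the sign. SEAT applies the same statistic to sentence embeddings instead of word embeddings.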

Many studies have revealed that word embeddings, language models, and models for specific downstream tasks in NLP are prone to social biases, especially gender bias. Recently these techniques have been gradually applied to automatic evaluation metrics for text generation. In this paper, we propose an evaluation method based on the Word Embeddings Association Test (WEAT) and the Sentence Embeddings Association Test (SEAT) to quantify social biases in evaluation metrics, and we find that social biases are also widely present in some model-based automatic evaluation metrics. Moreover, we construct gender-swapped meta-evaluation datasets to explore the potential impact of gender bias in image captioning and text summarization tasks. Results show that, given gender-neutral references in the evaluation, model-based evaluation metrics may show a preference for the male hypothesis, and their performance, i.e., the correlation between evaluation metrics and human judgments, usually shows more significant variation after gender swapping.
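The gender-swapping probe described in the abstract can be sketched as follows: score a hypothesis and its gender-swapped counterpart against the same gender-neutral reference and compare the two scores. This is a minimal sketch under stated assumptions, not the paper's pipeline. The swap list, `gender_swap`, `preference_gap`, and `toy_overlap` are hypothetical names introduced here, and the swap list is deliberately tiny (it omits ambiguous pairs such as "her" and "his").

```python
import re

# Hypothetical, deliberately small swap list (an assumption for illustration).
SWAP = {
    "man": "woman", "woman": "man",
    "he": "she", "she": "he",
    "boy": "girl", "girl": "boy",
}
PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, SWAP)) + r")\b", flags=re.IGNORECASE
)


def gender_swap(text):
    """Replace each gendered word with its counterpart, keeping capitalization."""
    def repl(match):
        word = match.group(0)
        swapped = SWAP[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped
    return PATTERN.sub(repl, text)


def preference_gap(metric, reference, male_hypothesis):
    """Metric score of the male hypothesis minus that of its swapped version.
    A positive gap means the metric prefers the male hypothesis."""
    female_hypothesis = gender_swap(male_hypothesis)
    return metric(reference, male_hypothesis) - metric(reference, female_hypothesis)


def toy_overlap(reference, hypothesis):
    """Stand-in metric: fraction of hypothesis tokens found in the reference.
    A model-based metric would be plugged in here instead."""
    ref_tokens = set(reference.lower().split())
    hyp_tokens = hypothesis.lower().split()
    return sum(tok in ref_tokens for tok in hyp_tokens) / len(hyp_tokens)


reference = "A person is riding a bike."
hypothesis = "A man is riding a bike."
print(gender_swap(hypothesis))
print(preference_gap(toy_overlap, reference, hypothesis))
```

With a purely lexical stand-in like `toy_overlap`, the gap against a gender-neutral reference is zero because neither gendered word appears in the reference. A nonzero gap would have to come from the metric's underlying model, which is the kind of preference the abstract reports for some model-based metrics.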

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
