CL, May 9, 2023

Mitigating Bias in Text Classification via Prompt-Based Text Transformation

arXiv:2305.06166v3, 1 citation
AI Analysis

This paper addresses bias in automated decision-making for text classification, offering a practical but incremental approach.

The paper tackles bias in text classification by prompting ChatGPT to rewrite text. Rewriting significantly reduced location classification accuracy, which is the intended effect because it indicates weaker demographic signals, while preserving sentiment and rating prediction performance.

The presence of specific linguistic signals particular to a certain sub-group can become highly salient to language models during training. In automated decision-making settings, this may lead to biased outcomes when models rely on cues that correlate with protected characteristics. We investigate whether prompting ChatGPT to rewrite text using simplification, neutralisation, localisation, and formalisation can reduce demographic signals while preserving meaning. Experimental results show a statistically significant drop in location classification accuracy across multiple models after transformation, suggesting reduced reliance on group-specific language. At the same time, sentiment analysis and rating prediction tasks confirm that the core meaning of the reviews remains largely intact. These results suggest that prompt-based rewriting offers a practical and generalisable approach for mitigating bias in text classification.
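The pipeline described in the abstract (rewrite each review with one of four prompt strategies, then check how much a location classifier's accuracy falls) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the `target_locale` parameter, and every function name here are assumptions, and the `toy_llm` and `toy_location_classifier` stand-ins replace ChatGPT and the trained classifiers purely so the sketch runs offline. Significance testing is omitted.

```python
from typing import Callable, Dict, List, Sequence

# Hypothetical prompt wording; the paper's exact prompts are not reproduced here.
PROMPT_TEMPLATES: Dict[str, str] = {
    "simplification": "Rewrite the following review in simpler language, keeping its meaning:\n\n{text}",
    "neutralisation": "Rewrite the following review in neutral language without regional expressions, keeping its meaning:\n\n{text}",
    "localisation": "Rewrite the following review as a writer from {target_locale} would phrase it, keeping its meaning:\n\n{text}",
    "formalisation": "Rewrite the following review in a formal register, keeping its meaning:\n\n{text}",
}


def build_prompt(strategy: str, text: str, target_locale: str = "the United States") -> str:
    """Fill the template for one strategy (target_locale is only used by localisation)."""
    return PROMPT_TEMPLATES[strategy].format(text=text, target_locale=target_locale)


def transform(texts: Sequence[str], strategy: str, llm: Callable[[str], str]) -> List[str]:
    """Rewrite every text by sending the filled prompt to an LLM callable."""
    return [llm(build_prompt(strategy, t)) for t in texts]


def accuracy(classifier: Callable[[str], str], texts: Sequence[str], labels: Sequence[str]) -> float:
    """Fraction of texts whose predicted label matches the gold label."""
    correct = sum(classifier(t) == y for t, y in zip(texts, labels))
    return correct / len(labels)


def accuracy_drop(classifier: Callable[[str], str], original: Sequence[str],
                  rewritten: Sequence[str], labels: Sequence[str]) -> float:
    """Accuracy on the original texts minus accuracy on the rewritten texts."""
    return accuracy(classifier, original, labels) - accuracy(classifier, rewritten, labels)


def toy_llm(prompt: str) -> str:
    """Stand-in for a chat model: strips the instruction and normalises one spelling."""
    review = prompt.split("\n\n", 1)[1]
    return review.replace("colour", "color")


def toy_location_classifier(text: str) -> str:
    """Stand-in for a trained location classifier keyed on a single spelling cue."""
    return "UK" if "colour" in text else "US"


if __name__ == "__main__":
    reviews = ["The colour is lovely", "The color is nice"]
    locations = ["UK", "US"]
    rewritten = transform(reviews, "neutralisation", toy_llm)
    print(accuracy_drop(toy_location_classifier, reviews, rewritten, locations))
```

A larger drop means the classifier can recover less location information from the rewritten text. In a fuller evaluation the same rewritten texts would also be passed to sentiment and rating predictors to check that meaning is preserved.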
