CL, Aug 14, 2019

Debiasing Personal Identities in Toxicity Classification

arXiv:1908.05757v1, 5 citations
AI Analysis

This paper addresses bias in automated text classification decisions, but it is incremental: it focuses on detecting bias and characterizing trade-offs rather than proposing a new solution.

The paper tackles bias in toxicity classification by measuring model performance separately for each demographic subgroup and comparing it to performance on the rest of the data. It finds that completely removing bias from a dataset can lead to worse outcomes.

As machine learning models continue to be relied upon for making automated decisions, the issue of model bias becomes more and more prevalent. In this paper, we approach training a text classification model and optimize for bias minimization by measuring not only the model's performance on our dataset as a whole, but also how it performs across different subgroups. This requires measuring performance independently for different demographic subgroups and measuring bias by comparing them to results from the rest of our data. We show how unintended bias can be detected using these metrics and how removing bias from a dataset completely can lead to worse results.
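
The subgroup comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the exact metrics used, so this example assumes ROC AUC as the performance measure and a simple difference as the bias signal. The function names (`roc_auc`, `subgroup_gap`) and the toy data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a random toxic example
    receives a higher score than a random non-toxic one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def subgroup_gap(
    labels: Sequence[int],
    scores: Sequence[float],
    in_subgroup: Sequence[bool],
) -> Tuple[float, float, float]:
    """Return (subgroup AUC, rest-of-data AUC, rest minus subgroup).

    A large positive gap suggests the model ranks examples that mention
    the subgroup less reliably than it ranks everything else.
    """
    sub: List[Tuple[int, float]] = []
    rest: List[Tuple[int, float]] = []
    for y, s, g in zip(labels, scores, in_subgroup):
        (sub if g else rest).append((y, s))
    sub_auc = roc_auc([y for y, _ in sub], [s for _, s in sub])
    rest_auc = roc_auc([y for y, _ in rest], [s for _, s in rest])
    return sub_auc, rest_auc, rest_auc - sub_auc


# Toy data (hypothetical): 1 = toxic, 0 = non-toxic.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.8, 0.6, 0.7, 0.9, 0.1, 0.8, 0.2]
# First four comments mention the identity subgroup; the last four do not.
in_subgroup = [True, True, True, True, False, False, False, False]

sub_auc, rest_auc, gap = subgroup_gap(labels, scores, in_subgroup)
print(f"subgroup AUC={sub_auc:.2f}, rest AUC={rest_auc:.2f}, gap={gap:.2f}")
```

In this toy data the model separates toxic from non-toxic comments perfectly outside the subgroup but no better than chance inside it, which is the kind of disparity a per-subgroup measurement is meant to surface. The pairwise AUC here is quadratic in the number of examples; on a real dataset a library implementation would be a more practical choice.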
