HCAI, Apr 4, 2020

Measuring Social Biases of Crowd Workers using Counterfactual Queries

arXiv:2004.02028v1, 16 citations
AI Analysis

This paper addresses the issue of biased training data in machine learning, which can propagate social biases into AI systems. The contribution is incremental, since it builds on existing fairness concepts.

The paper tackles the problem of social biases in crowdsourced datasets by proposing a method to quantify each crowd worker's inherent bias using counterfactual fairness, enabling the curation of less biased datasets.

Social biases based on gender, race, etc. have been shown to pollute the machine learning (ML) pipeline, predominantly via biased training datasets. Crowdsourcing, a popular and cost-effective way to gather labeled training datasets, is not immune to the inherent social biases of crowd workers. To ensure such social biases aren't passed on to the curated datasets, it's important to know how biased each crowd worker is. In this work, we propose a new method based on counterfactual fairness to quantify the degree of inherent social bias in each crowd worker. This extra information can be leveraged together with individual worker responses to curate a less biased dataset.
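
The abstract does not spell out the scoring procedure, so the following is only a minimal sketch of one way a counterfactual query could yield a per-worker bias score. It assumes each worker labels an item and a counterfactual copy of it in which only the sensitive attribute is changed, and it takes the fraction of pairs where the label changes as the score. The function name `worker_bias_scores`, the tuple layout, and the flip-rate measure are all illustrative assumptions, not the paper's actual method.

```python
from collections import defaultdict


def worker_bias_scores(responses):
    """Return a hypothetical bias score per worker.

    `responses` is an iterable of (worker_id, label_original, label_counterfactual)
    tuples. The score is the fraction of query pairs on which the worker's label
    changed when only the sensitive attribute was altered (an assumed measure).
    """
    flipped = defaultdict(int)  # pairs where the two labels disagree
    total = defaultdict(int)    # all pairs answered by the worker
    for worker_id, label_orig, label_cf in responses:
        total[worker_id] += 1
        if label_orig != label_cf:
            flipped[worker_id] += 1
    return {worker: flipped[worker] / total[worker] for worker in total}


# Made-up toy responses, for illustration only.
responses = [
    ("w1", "toxic", "toxic"),
    ("w1", "not_toxic", "not_toxic"),
    ("w2", "toxic", "not_toxic"),
    ("w2", "not_toxic", "not_toxic"),
]
scores = worker_bias_scores(responses)
print(scores)
```

Under these assumptions, a score near 0 means the worker's labels were insensitive to the attribute change. Such scores could then be used, for example, to down-weight a worker's responses when aggregating labels.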
