HC | Jan 10, 2018

Exploring Stereotypes and Biased Data with the Crowd

arXiv:1801.03261v1 | 7.35 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses bias in machine learning data collection, but its contribution is incremental: it explores a preliminary method without large-scale validation.

The study investigated whether crowdsourced workers could anticipate stereotypes that might bias datasets, finding that crowd suggestions showed diversity and could potentially help prevent bias during data collection.

The goal of our research is to contribute information about how useful the crowd is at anticipating stereotypes that may be biasing a dataset without a researcher's knowledge. The results of the crowd's predictions can potentially be used during data collection to help prevent the suspected stereotypes from introducing bias into the dataset. We conduct our research by asking the crowd on Amazon's Mechanical Turk (AMT) to complete two similar Human Intelligence Tasks (HITs) by suggesting stereotypes relating to their personal experience. Our analysis of these responses focuses on determining the level of diversity in the workers' suggestions and in their demographics. Through this process, we begin a discussion of how useful the crowd can be in tackling this difficult problem within machine learning data collection.
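The abstract does not say how the diversity of workers' suggestions was measured. As a rough illustration only, the sketch below shows one common way to quantify diversity in a set of free-text responses: the share of distinct suggestions and the Shannon entropy of their frequencies. The function names, the normalization step, and the choice of metric are all assumptions made for this example, not the authors' method.

```python
from collections import Counter
import math


def normalize(suggestion: str) -> str:
    """Lowercase and collapse whitespace so trivially different spellings match.

    Hypothetical helper: the paper does not describe any normalization step.
    """
    return " ".join(suggestion.lower().split())


def diversity_summary(suggestions: list[str]) -> dict[str, float]:
    """Summarize how varied a list of free-text suggestions is.

    Assumed metrics (not taken from the paper):
      - distinct_ratio: distinct suggestions / total suggestions
      - entropy_bits: Shannon entropy of the suggestion frequencies, in bits
    """
    # Skip blank responses, then count each normalized suggestion.
    counts = Counter(normalize(s) for s in suggestions if s.strip())
    total = sum(counts.values())
    if total == 0:
        return {"total": 0, "distinct": 0, "distinct_ratio": 0.0, "entropy_bits": 0.0}

    # Entropy is 0 when every worker gives the same answer and grows as
    # answers spread more evenly over more distinct suggestions.
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return {
        "total": total,
        "distinct": len(counts),
        "distinct_ratio": len(counts) / total,
        "entropy_bits": entropy,
    }
```

A higher distinct ratio or entropy would suggest that workers are surfacing a broader range of candidate stereotypes. Exact string matching after light normalization is a simplification; paraphrases of the same idea would still count as distinct.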
