HC, Oct 17, 2016

Optimizing Open-Ended Crowdsourcing: The Next Frontier in Crowdsourced Data Management

arXiv:1610.05377v1, 11 citations
Originality: Synthesis-oriented
AI Analysis

The paper tackles the problem of improving data quality in crowdsourcing for researchers and practitioners, but its contribution is incremental because it reviews existing approaches.

This paper surveys the field of optimizing open-ended crowdsourcing, addressing challenges like resolving worker disagreements and selecting appropriate operators, based on the authors' experiences.

Crowdsourcing is the primary means to generate training data at scale, and when combined with sophisticated machine learning algorithms, crowdsourcing is an enabler for a variety of emergent automated applications impacting all spheres of our lives. This paper surveys the emerging field of formally reasoning about and optimizing open-ended crowdsourcing, a popular and crucially important, but severely understudied class of crowdsourcing---the next frontier in crowdsourced data management. The underlying challenges include distilling the right answer when none of the workers agree with each other, teasing apart the various perspectives adopted by workers when answering tasks, and effectively selecting between the many open-ended operators appropriate for a problem. We describe the approaches that we've found to be effective for open-ended crowdsourcing, drawing from our experiences in this space.

