AI LG Oct 16, 2012

Crowdsourcing Control: Moving Beyond Multiple Choice

arXiv:1210.4870v1 85 citations

Originality: Incremental advance

AI Analysis

The paper addresses a limitation of existing crowdsourcing models, which assume a finite set of outcomes, and so enables more accurate and efficient handling of open-ended tasks on platforms such as Amazon Mechanical Turk. It is an incremental improvement over prior methods.

The paper tackled crowdsourcing tasks that require free-response answers with infinite outcome spaces, such as audio transcription, by developing LazySusan, a decision-theoretic controller that dynamically requests responses as needed. In live experiments on SAT Math questions, LazySusan eliminated 83.2% of the error and achieved greater net utility than majority voting, the state-of-the-art strategy.

To ensure quality results from crowdsourced tasks, requesters often aggregate worker responses and use one of a plethora of strategies to infer the correct answer from the set of noisy responses. However, all current models assume prior knowledge of all possible outcomes of the task. While not an unreasonable assumption for tasks that can be posited as multiple-choice questions (e.g. n-ary classification), we observe that many tasks do not naturally fit this paradigm, but instead demand a free-response formulation where the outcome space is of infinite size (e.g. audio transcription). We model such tasks with a novel probabilistic graphical model, and design and implement LazySusan, a decision-theoretic controller that dynamically requests responses as necessary in order to infer answers to these tasks. We also design an EM algorithm to jointly learn the parameters of our model while inferring the correct answers to multiple tasks at a time. Live experiments on Amazon Mechanical Turk demonstrate the superiority of LazySusan at solving SAT Math questions, eliminating 83.2% of the error and achieving greater net utility compared to the state-of-the-art strategy, majority-voting. We also show in live experiments that our EM algorithm outperforms majority-voting on a visualization task that we design.
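To make the contrast in the abstract concrete, here is a minimal Python sketch of the majority-voting baseline next to a controller that requests free-text responses one at a time and stops once it is reasonably sure. It is not the paper's method: LazySusan's probabilistic model and utility calculation are not described on this page, so a simple "stop when the leading answer is ahead by `margin` votes" rule stands in for them. The names `ask`, `margin`, and `max_requests` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Callable, List, Optional, Tuple


def majority_vote(responses: List[str]) -> Optional[str]:
    """Return the most common response; ties go to the answer seen first."""
    if not responses:
        return None
    return Counter(responses).most_common(1)[0][0]


def request_until_confident(
    ask: Callable[[], str],
    margin: int = 2,
    max_requests: int = 10,
) -> Tuple[Optional[str], int]:
    """Request responses one at a time until the top answer leads by `margin`.

    This is a stand-in heuristic (assumption), not a decision-theoretic
    controller: it never weighs the cost of a request against expected gain.
    Returns the chosen answer and the number of responses requested.
    """
    responses: List[str] = []
    while len(responses) < max_requests:
        responses.append(ask())
        ranked = Counter(responses).most_common(2)
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        if ranked[0][1] - runner_up >= margin:
            break
    return majority_vote(responses), len(responses)


# Simulated workers answering a free-response question (canned data).
canned = iter(["12", "7", "12", "12", "3"])
answer, used = request_until_confident(lambda: next(canned))
print(answer, used)  # prints: 12 4
```

Because free-response answers are open strings, the vote counts are keyed by whatever answers actually appear, with no predefined option list. The adaptive loop stops after four responses here, whereas a fixed-size vote would have paid for all five.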
