Cheaper and Better: Selecting Good Workers for Crowdsourcing
This paper addresses cost-effective data collection on crowdsourcing platforms, though it appears incremental because it builds on existing worker selection methods.
The paper tackles the problem of selecting high-quality workers from a pool to maximize accuracy under a budget constraint in crowdsourcing. It shows that the proposed algorithm selects a small number of workers who perform as well as, or better than, larger crowds.
Crowdsourcing provides a popular paradigm for data collection at scale. We study the problem of selecting subsets of workers from a given worker pool to maximize accuracy under a budget constraint. One natural question is whether we should hire as many workers as the budget allows, or restrict ourselves to a small number of top-quality workers. By theoretically analyzing the error rate of a typical setting in crowdsourcing, we frame the worker selection problem as a combinatorial optimization problem and propose an algorithm to solve it efficiently. Empirical results on both simulated and real-world datasets show that our algorithm is able to select a small number of high-quality workers, and performs as well as, and sometimes even better than, the much larger crowds that the budget allows.
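
The trade-off described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm. It assumes a simplified setting chosen only for illustration: binary labels, independent workers whose accuracies are known, unweighted majority voting with ties broken by a fair coin, and a budget counted as the maximum number of workers hired at unit cost. The function names `majority_vote_error` and `select_top_k` are hypothetical.

```python
from typing import List, Tuple


def majority_vote_error(accuracies: List[float]) -> float:
    """Exact error probability of unweighted majority voting on a binary task.

    Assumes independent workers with known accuracies (an illustrative
    assumption); ties are broken by a fair coin.
    """
    # dist[c] = probability that exactly c of the workers seen so far are correct
    dist = [1.0]
    for p in accuracies:
        nxt = [0.0] * (len(dist) + 1)
        for c, prob in enumerate(dist):
            nxt[c] += prob * (1.0 - p)   # this worker answers incorrectly
            nxt[c + 1] += prob * p       # this worker answers correctly
        dist = nxt

    n = len(accuracies)
    error = 0.0
    for c, prob in enumerate(dist):
        if 2 * c < n:        # strict minority correct: the vote is wrong
            error += prob
        elif 2 * c == n:     # tie: wrong half of the time
            error += 0.5 * prob
    return error


def select_top_k(accuracies: List[float], budget: int) -> Tuple[int, float]:
    """Pick how many of the most accurate workers to hire (at most `budget`).

    Only prefixes of the accuracy-sorted pool are considered, which is a
    simple heuristic rather than a search over all subsets.
    """
    ranked = sorted(accuracies, reverse=True)
    best_k, best_err = 1, majority_vote_error(ranked[:1])
    for k in range(2, min(budget, len(ranked)) + 1):
        err = majority_vote_error(ranked[:k])
        if err < best_err:
            best_k, best_err = k, err
    return best_k, best_err


if __name__ == "__main__":
    # Hypothetical pool: three strong workers and six near-random ones.
    pool = [0.95, 0.9, 0.85, 0.6, 0.6, 0.55, 0.55, 0.55, 0.5]
    k, err = select_top_k(pool, budget=len(pool))
    print("workers selected:", k, "error:", err)
    print("error when hiring the whole pool:", majority_vote_error(pool))
```

On this hypothetical pool, a majority vote over the three strongest workers has a lower error than a vote over all nine, which mirrors the abstract's point that spending the whole budget on more workers is not always the best choice.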