Approval Voting and Incentives in Crowdsourcing
This paper addresses the challenge of improving label quality in crowdsourcing for machine learning. The contribution appears incremental, since it builds on existing voting and incentive mechanisms.
The paper tackles low-quality crowdsourced labels, which it attributes to non-expert workers, misaligned incentives, and restrictive interfaces. It introduces approval voting paired with an incentive-compatible compensation mechanism, shows that the mechanism is theoretically optimal, and reports preliminary empirical validation on Amazon Mechanical Turk.
The growing need for labeled training data has made crowdsourcing an important part of machine learning. The quality of crowdsourced labels is, however, adversely affected by three factors: (1) the workers are not experts; (2) the incentives of the workers are not aligned with those of the requesters; and (3) the interface does not allow workers to convey their knowledge accurately, by forcing them to make a single choice among a set of options. In this paper, we address these issues by introducing approval voting to utilize the expertise of workers who have partial knowledge of the true answer, and coupling it with a ("strictly proper") incentive-compatible compensation mechanism. We show rigorous theoretical guarantees of optimality of our mechanism together with a simple axiomatic characterization. We also conduct preliminary empirical studies on Amazon Mechanical Turk which validate our approach.
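To make the interface idea concrete, here is a minimal sketch of how approval-style responses could be collected and tallied. It illustrates only the generic approval-voting rule (each worker may select any subset of the options, and the most-approved option wins). It is not the paper's compensation mechanism, which the abstract does not specify. The function name `aggregate_approvals` and the example ballots are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set


def aggregate_approvals(ballots: Iterable[Set[str]]) -> List[str]:
    """Return the option(s) approved by the most workers.

    Each ballot is the set of options one worker approved. A worker with
    partial knowledge can approve several options instead of being forced
    to pick exactly one. Ties are all returned, in sorted order.
    """
    counts: Counter = Counter()
    for ballot in ballots:
        counts.update(ballot)  # one approval per option per worker
    if not counts:
        return []
    top = max(counts.values())
    return sorted(option for option, n in counts.items() if n == top)


# Hypothetical ballots from four workers on one question with options A-D.
ballots = [{"A", "B"}, {"B"}, {"B", "C"}, {"A"}]
winners = aggregate_approvals(ballots)
print(winners)
```

In this toy example the first and third workers are unsure between two options and approve both, so their partial knowledge still contributes to the tally. A single-choice interface would have forced each of them to guess.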