LG · HC · Jul 5, 2022

Unsupervised Crowdsourcing with Accuracy and Cost Guarantees

arXiv:2207.01988v1 · 2 citations · h-index: 13
Originality: Incremental advance
AI Analysis

This work addresses the trade-off between cost and accuracy in crowdsourced binary classification and offers algorithms with theoretical guarantees. It appears incremental because it builds on existing models and algorithms.

The paper tackles the problem of cost-effective unsupervised crowdsourcing for binary classification with a given error threshold, proposing algorithms that guarantee error bounds and achieve near-optimal cost when sufficient unlabeled items are available.

We consider the problem of cost-optimal utilization of a crowdsourcing platform for binary, unsupervised classification of a collection of items, given a prescribed error threshold. Workers on the crowdsourcing platform are assumed to be divided into multiple classes, based on their skill, experience, and/or past performance. We model each worker class via an unknown confusion matrix, and a (known) price to be paid per label prediction. For this setting, we propose algorithms for acquiring label predictions from workers, and for inferring the true labels of items. We prove that if the number of (unlabeled) items available is large enough, our algorithms satisfy the prescribed error thresholds, incurring a cost that is near-optimal. Finally, we validate our algorithms, and some heuristics inspired by them, through an extensive case study.
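The setting described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it models each worker class as a 2x2 confusion matrix with a per-label price, and aggregates labels by simple majority vote as a stand-in for the paper's inference procedure. All names (`WorkerClass`, `run_plan`), accuracies, and prices are hypothetical. The simulation uses ground-truth labels only to measure the error rate; in the unsupervised setting of the paper, neither the true labels nor the confusion matrices are known to the requester.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WorkerClass:
    name: str
    price: float                   # known price paid per label prediction
    confusion: List[List[float]]   # confusion[true][predicted]; rows sum to 1


def sample_label(worker: WorkerClass, true_label: int, rng: random.Random) -> int:
    """Draw one binary label prediction from the worker's confusion matrix."""
    p_predict_one = worker.confusion[true_label][1]
    return 1 if rng.random() < p_predict_one else 0


def majority_vote(labels: List[int]) -> int:
    """Return 1 if strictly more than half the labels are 1, else 0."""
    return 1 if 2 * sum(labels) > len(labels) else 0


def run_plan(
    items: List[int],
    plan: List[Tuple[WorkerClass, int]],
    rng: random.Random,
) -> Tuple[float, float]:
    """Collect labels per item according to `plan` and aggregate them.

    `plan` lists (worker class, number of labels requested per item).
    Returns (total cost, empirical error rate of the majority vote).
    """
    cost = 0.0
    errors = 0
    for truth in items:
        votes: List[int] = []
        for worker, count in plan:
            votes.extend(sample_label(worker, truth, rng) for _ in range(count))
            cost += worker.price * count
        if majority_vote(votes) != truth:
            errors += 1
    return cost, errors / len(items)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical worker classes: a cheap, noisier class and a pricier, more accurate one.
    novice = WorkerClass("novice", 1.0, [[0.7, 0.3], [0.3, 0.7]])
    expert = WorkerClass("expert", 4.0, [[0.9, 0.1], [0.1, 0.9]])
    items = [rng.randint(0, 1) for _ in range(1000)]
    for label, plan in [("3 novice labels", [(novice, 3)]),
                        ("1 expert label", [(expert, 1)])]:
        cost, err = run_plan(items, plan, rng)
        print(f"{label}: cost={cost:.0f}, error rate={err:.3f}")
```

Comparing plans this way shows the trade-off the paper studies: different mixes of worker classes reach different error rates at different total costs, and the goal is the cheapest plan that still meets the prescribed error threshold.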

