HC · LG · Jan 25, 2024

Efficient Online Crowdsourcing with Complex Annotations

Reshef Meir, Viet-An Nguyen, Xu Chen, Jagdish Ramakrishnan, Udi Weinsberg

arXiv:2401.15116v1 · 2.71 citations · AAAI

Originality Highly original

AI Analysis

This work addresses the challenge of optimizing annotation efficiency in crowdsourcing platforms. The contribution is incremental in that it builds on existing truth discovery methods for complex tasks.

The paper tackles the problem of efficiently trading off cost and quality in online crowdsourcing for complex annotations such as bounding boxes. It proposes a novel approach that infers the accuracy of each reported label, and it demonstrates improved cost-quality trade-offs on real-world crowdsourcing data from Meta.

Crowdsourcing platforms use various truth discovery algorithms to aggregate annotations from multiple labelers. In an online setting, however, the main challenge is to decide whether to ask for more annotations for each item, so as to efficiently trade off cost (i.e., the number of annotations) against the quality of the aggregated annotations. In this paper, we propose a novel approach for general complex annotations (such as bounding boxes and taxonomy paths) that works in an online crowdsourcing setting. We prove that the expected average similarity of a labeler is linear in their accuracy, conditional on the reported label. This enables us to infer reported label accuracy in a broad range of scenarios. We conduct extensive evaluations on real-world crowdsourcing data from Meta and show the effectiveness of our proposed online algorithms in improving the cost-quality trade-off.
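
To make the online setting concrete, the following is a minimal sketch of a stopping rule for bounding-box annotation. It is not the authors' algorithm: it simply uses each label's average similarity (here, intersection over union) to the other labels as a rough proxy for its accuracy, and stops requesting annotations once the best label clears a threshold. The names `annotate_online` and `request_label`, and the parameters `min_labels`, `max_labels`, and `threshold`, are assumptions made for illustration.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_similarities(labels: List[Box]) -> List[float]:
    """Average similarity of each label to all other labels on the same item."""
    n = len(labels)
    if n < 2:
        return [0.0] * n
    return [
        sum(iou(labels[i], labels[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


def annotate_online(
    request_label: Callable[[], Box],  # hypothetical: fetches one more annotation
    min_labels: int = 2,
    max_labels: int = 5,
    threshold: float = 0.7,  # assumed stopping threshold, not from the paper
) -> Tuple[Box, int]:
    """Request labels one at a time; stop when the best label looks reliable.

    Returns the chosen label and the number of annotations paid for.
    """
    if max_labels < 1:
        raise ValueError("max_labels must be at least 1")
    labels: List[Box] = []
    while len(labels) < max_labels:
        labels.append(request_label())
        if len(labels) >= min_labels:
            sims = average_similarities(labels)
            best = max(range(len(labels)), key=sims.__getitem__)
            if sims[best] >= threshold:
                return labels[best], len(labels)
    # Budget exhausted: fall back to the label most similar to the others.
    sims = average_similarities(labels)
    best = max(range(len(labels)), key=sims.__getitem__)
    return labels[best], len(labels)
```

When the first two labelers agree closely, the loop stops after two annotations. When they disagree, it keeps requesting labels until one of them is sufficiently supported by the others or the budget runs out. This is the cost-quality trade-off the abstract describes, in its simplest form.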
