LG, ME | Sep 29, 2025

Crowdsourcing Without People: Modelling Clustering Algorithms as Experts

Jordyn E. A. Lorentz, Katharine M. Clark

arXiv:2509.25395v1

Originality: Incremental advance

AI Analysis

The method offers a robust alternative for non-expert users when the true data structure is unknown, though the contribution is incremental because it adapts an existing model to a new context.

The paper tackles the problem of aggregating predictions from multiple clustering algorithms. It adapts the Dawid-Skene model to treat algorithm outputs as noisy annotations. In experiments on simulated and real-world datasets, the resulting method consistently approaches the best performance and avoids poor outcomes.

This paper introduces mixsemble, an ensemble method that adapts the Dawid-Skene model to aggregate predictions from multiple model-based clustering algorithms. Unlike traditional crowdsourcing, which relies on human labels, the framework models the outputs of clustering algorithms as noisy annotations. Experiments on both simulated and real-world datasets show that, although mixsemble is not always the single top performer, it consistently approaches the best result and avoids poor outcomes. This robustness makes it a practical alternative when the true data structure is unknown, especially for non-expert users.
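To make the idea of treating clustering outputs as noisy annotations concrete, the following is a minimal sketch of the classic Dawid-Skene EM procedure, not the authors' mixsemble implementation. The function name `dawid_skene`, the smoothing constant, and the simulated "algorithms" are all hypothetical. The sketch also assumes the cluster labels from the different algorithms have already been aligned to a common numbering, which the summary above does not describe.

```python
import numpy as np

def dawid_skene(labels, n_classes, n_iter=50, smooth=1e-2):
    """Classic Dawid-Skene EM (illustrative sketch, hypothetical helper).

    labels: (n_items, n_annotators) integer array with entries in
            0..n_classes-1. Here each "annotator" stands for one
            clustering algorithm; labels are assumed pre-aligned.
    Returns the consensus labels, the posterior class probabilities,
    and the estimated per-annotator confusion matrices.
    """
    # onehot[i, a, l] = 1 if annotator a assigned label l to item i.
    onehot = np.eye(n_classes)[labels]
    # Initialise the posteriors with a soft majority vote.
    post = onehot.sum(axis=1)
    post /= post.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # M-step: class proportions and confusion matrices, where
        # conf[a, k, l] estimates P(annotator a reports l | true class k).
        prior = post.mean(axis=0)
        conf = np.einsum("ik,ial->akl", post, onehot) + smooth
        conf /= conf.sum(axis=2, keepdims=True)
        # E-step: posterior over the true class, computed in log space.
        log_post = np.log(prior + 1e-12) + np.einsum(
            "ial,akl->ik", onehot, np.log(conf)
        )
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
    return post.argmax(axis=1), post, conf

# Simulated example: three "algorithms" of differing reliability.
rng = np.random.default_rng(0)
truth = rng.integers(0, 3, size=300)
labels = np.empty((300, 3), dtype=int)
for a, acc in enumerate([0.9, 0.85, 0.4]):
    keep = rng.random(300) < acc          # keep the true label with prob. acc
    noise = rng.integers(0, 3, size=300)  # otherwise draw a random label
    labels[:, a] = np.where(keep, truth, noise)

consensus, post, conf = dawid_skene(labels, n_classes=3)
accuracy = (consensus == truth).mean()
print(f"consensus accuracy: {accuracy:.3f}")
```

The estimated confusion matrices let the aggregate down-weight an unreliable algorithm rather than counting every vote equally, which is one plausible reading of why such an ensemble would avoid poor outcomes.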
