Crowdsourcing Without People: Modelling Clustering Algorithms as Experts
Treating clustering algorithms as annotators in a Dawid-Skene model provides a robust alternative for non-expert users when the true data structure is unknown, although the contribution is incremental: it adapts an existing model to a new context.
The paper addresses the problem of aggregating predictions from multiple clustering algorithms by adapting the Dawid-Skene model to treat algorithm outputs as noisy annotations. In experiments on simulated and real-world datasets, the resulting method consistently approaches the best performance and avoids poor outcomes.
This paper introduces mixsemble, an ensemble method that adapts the Dawid-Skene model to aggregate predictions from multiple model-based clustering algorithms. Unlike traditional crowdsourcing, which relies on human labels, the framework models the outputs of clustering algorithms as noisy annotations. Experiments on both simulated and real-world datasets show that, although mixsemble is not always the single top performer, it consistently approaches the best result and avoids poor outcomes. This robustness makes it a practical alternative when the true data structure is unknown, especially for non-expert users.
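To make the idea of treating algorithm outputs as noisy annotations concrete, the following is a minimal sketch of a generic Dawid-Skene EM loop in NumPy. It is not the authors' mixsemble implementation. The function name `dawid_skene`, the `smoothing` constant, and the soft majority-vote initialisation are illustrative assumptions, and the sketch further assumes that the cluster labels from the different algorithms have already been aligned to a common numbering (clustering outputs are only defined up to a permutation of labels).

```python
import numpy as np


def dawid_skene(labels, n_classes, n_iter=50, smoothing=1e-2):
    """Generic Dawid-Skene EM (illustrative sketch, not the paper's code).

    labels: (n_items, n_annotators) integer array with entries in
            0..n_classes-1; each column is one clustering algorithm's
            output, assumed already aligned to a common label numbering.
    Returns (posterior, class_proportions, confusion_matrices).
    """
    labels = np.asarray(labels)
    # onehot[i, a, l] = 1 if annotator a assigned label l to item i.
    onehot = np.eye(n_classes)[labels]
    # Initialise the posterior with each item's vote shares.
    post = onehot.mean(axis=1)
    for _ in range(n_iter):
        # M-step: class proportions and per-annotator confusion matrices,
        # conf[a, k, l] = P(annotator a says l | true class k).
        pi = post.mean(axis=0)
        conf = np.einsum("ik,ial->akl", post, onehot) + smoothing
        conf /= conf.sum(axis=2, keepdims=True)
        # E-step: posterior over the latent true class of each item.
        log_post = np.log(pi + 1e-12) + np.einsum(
            "ial,akl->ik", onehot, np.log(conf)
        )
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
    return post, pi, conf
```

Calling `post, pi, conf = dawid_skene(labels, n_classes=3)` and taking `post.argmax(axis=1)` gives a consensus partition. The estimated confusion matrices in `conf` indicate how much each algorithm is trusted, which is one way to read the robustness described above: an algorithm whose outputs disagree with the consensus receives a flatter confusion matrix and therefore less influence.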