LG ML Dec 12, 2023

Bayesian Online Learning for Consensus Prediction

Sam Showalter, Alex Boyd, Padhraic Smyth, Mark Steyvers

arXiv:2312.07679v1 3.83 citations h-index: 55 AISTATS

Originality: Incremental advance

AI Analysis

The paper addresses a practical but under-explored setting: cost-effective consensus prediction in machine learning. The advance is incremental, since it builds on existing methods for handling expert feedback.

The paper tackles online classification with a pre-trained classifier and multiple human experts, where querying humans is costly and ground truth is unavailable. It proposes a Bayesian framework that dynamically estimates expert consensus from partial feedback, and demonstrates its effectiveness on the CIFAR-10H and ImageNet-16H datasets.

Given a pre-trained classifier and multiple human experts, we investigate the task of online classification where model predictions are provided for free but querying humans incurs a cost. In this practical but under-explored setting, oracle ground truth is not available. Instead, the prediction target is defined as the consensus vote of all experts. Given that querying full consensus can be costly, we propose a general framework for online Bayesian consensus estimation, leveraging properties of the multivariate hypergeometric distribution. Based on this framework, we propose a family of methods that dynamically estimate expert consensus from partial feedback by producing a posterior over expert and model beliefs. Analyzing this posterior induces an interpretable trade-off between querying cost and classification performance. We demonstrate the efficacy of our framework against a variety of baselines on CIFAR-10H and ImageNet-16H, two large-scale crowdsourced datasets.
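To make the idea of estimating consensus from partial feedback concrete, the following is a minimal illustrative sketch, not the authors' method. It assumes a Dirichlet-multinomial prior over the full expert vote counts, seeded by the classifier's predicted probabilities. Under that assumption, observing a subset of votes without replacement (the multivariate hypergeometric setting) leaves the unqueried votes Dirichlet-multinomial with updated parameters. The names `consensus_posterior`, `should_query`, `prior_strength`, and `threshold` are hypothetical and chosen only for this example.

```python
import numpy as np


def consensus_posterior(model_probs, observed_counts, n_experts,
                        prior_strength=1.0, n_samples=2000, seed=0):
    """Monte Carlo estimate of P(class is the plurality vote of all experts).

    Assumption (not from the paper): the unqueried votes follow a
    Dirichlet-multinomial whose parameters combine the model's
    probabilities (as pseudo-counts) with the votes observed so far.
    """
    rng = np.random.default_rng(seed)
    model_probs = np.asarray(model_probs, dtype=float)
    observed = np.asarray(observed_counts, dtype=int)
    n_remaining = n_experts - int(observed.sum())
    if n_remaining < 0:
        raise ValueError("observed votes exceed the number of experts")

    # Small epsilon keeps every Dirichlet parameter strictly positive.
    alpha = prior_strength * model_probs + observed + 1e-6
    theta = rng.dirichlet(alpha, size=n_samples)

    # Simulate the votes of the experts who have not been queried yet.
    remaining = np.array([rng.multinomial(n_remaining, t) for t in theta])
    totals = observed + remaining

    # Plurality vote per simulation; ties go to the lowest class index.
    consensus = totals.argmax(axis=1)
    return np.bincount(consensus, minlength=len(model_probs)) / n_samples


def should_query(posterior, threshold=0.95):
    """Hypothetical stopping rule: query another expert while uncertain."""
    return float(np.max(posterior)) < threshold


# Example: 3 classes, 5 experts, 3 already queried and all chose class 2.
post = consensus_posterior([0.2, 0.3, 0.5], [0, 0, 3], n_experts=5)
print(post, should_query(post))
```

In this sketch, raising `threshold` buys more confidence in the predicted consensus at the price of more expert queries, which mirrors the trade-off between querying cost and classification performance described in the abstract.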
