LG AI OC

Dec 14, 2025

Optimal Labeler Assignment and Sampling for Active Learning in the Presence of Imperfect Labels

Pouya Ahadi, Blair Winograd, Camille Zaug, Karunesh Arora, Lijun Wang, Kamran Paynabar

arXiv:2512.12870v14.1

Originality: Incremental advance

AI Analysis

This work addresses the challenge of building accurate classifiers in active learning when labels are noisy. The advance is incremental: it refines existing methods for handling labeler imperfections.

The paper tackles the problem of label noise in active learning due to imperfect labelers, proposing a framework that optimally assigns queries to labelers and selects samples to minimize noise, resulting in significantly improved classification performance compared to benchmarks.

Active Learning (AL) has garnered significant interest across various application domains where labeling training data is costly. AL provides a framework that helps practitioners query informative samples for annotation by oracles (labelers). However, these labels often contain noise due to varying levels of labeler accuracy. Additionally, uncertain samples are more prone to receiving incorrect labels because of their complexity. Learning from imperfectly labeled data leads to an inaccurate classifier. We propose a novel AL framework to construct a robust classification model by minimizing noise levels. Our approach includes an assignment model that optimally assigns query points to labelers, aiming to minimize the maximum possible noise within each cycle. Additionally, we introduce a new sampling method to identify the best query points, reducing the impact of label noise on classifier performance. Our experiments demonstrate that our approach significantly improves classification performance compared to several benchmark methods.
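The min-max idea behind the assignment model can be illustrated with a small sketch. This is not the authors' formulation, which the abstract does not spell out. It assumes a hypothetical matrix `noise[i][j]` giving the estimated probability that labeler `j` mislabels query point `i`, assumes each labeler receives at most one query per cycle, and uses brute-force search that is only practical for very small instances. The function name `minimax_assignment` is likewise illustrative.

```python
from itertools import permutations


def minimax_assignment(noise):
    """Assign each query to a distinct labeler, minimizing the worst-case noise.

    noise[i][j] is an ASSUMED estimate of the probability that labeler j
    mislabels query i (not taken from the paper). Requires that there are
    at least as many labelers as queries.
    Returns (assignment, worst_noise), where assignment[i] is the labeler
    chosen for query i.
    """
    n_queries = len(noise)
    n_labelers = len(noise[0])
    best, best_worst = None, float("inf")
    # Enumerate every way to give the queries distinct labelers.
    for perm in permutations(range(n_labelers), n_queries):
        # The objective is the largest noise level among the assigned pairs.
        worst = max(noise[i][perm[i]] for i in range(n_queries))
        if worst < best_worst:
            best, best_worst = perm, worst
    return list(best), best_worst


# Hypothetical noise estimates for 3 queries (rows) and 3 labelers (columns).
noise = [
    [0.10, 0.30, 0.40],
    [0.20, 0.15, 0.35],
    [0.45, 0.25, 0.05],
]
assignment, worst = minimax_assignment(noise)
print(assignment, worst)  # [0, 1, 2] 0.15
```

For realistic problem sizes, the exhaustive search would be replaced by an integer program or a bottleneck-assignment algorithm; the objective (minimize the maximum noise across assigned pairs) stays the same.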
