LG · ML · Dec 30, 2024

Uncertainty Herding: One Active Learning Method for All Label Budgets

arXiv:2412.20644v2 · 6 citations · h-index: 8 · ICLR
Originality Highly original
AI Analysis

The paper addresses the practical problem of needing different active learning methods for different label budgets, giving machine learning practitioners a single method that works reliably across budgets.

Active learning methods tend to perform poorly at one end of the label-budget range: methods designed for large budgets can be worse than random selection when labels are scarce, and low-budget methods degrade as budgets grow. The paper proposes an objective called uncertainty coverage and a greedy method, Uncertainty Herding, that generalizes across budgets and matches or beats state-of-the-art performance in essentially all tested cases.

Most active learning research has focused on methods which perform well when many labels are available, but can be dramatically worse than random selection when label budgets are small. Other methods have focused on the low-budget regime, but do poorly as label budgets increase. As the line between "low" and "high" budgets varies by problem, this is a serious issue in practice. We propose uncertainty coverage, an objective which generalizes a variety of low- and high-budget objectives, as well as natural, hyperparameter-light methods to smoothly interpolate between low- and high-budget regimes. We call greedy optimization of the estimate Uncertainty Herding; this simple method is computationally fast, and we prove that it nearly optimizes the distribution-level coverage. In experimental validation across a variety of active learning tasks, our proposal matches or beats state-of-the-art performance in essentially all cases; it is the only method of which we are aware that reliably works well in both low- and high-budget settings.
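To make the idea of greedily optimizing a coverage-style objective concrete, here is a minimal sketch. It is not the paper's implementation, and the abstract does not give the exact definition of uncertainty coverage. The sketch assumes an objective of the form mean_i u_i · max_{j in S} k(x_i, x_j), where u_i is some per-point uncertainty score and k is an RBF kernel. The function names (`rbf_kernel`, `weighted_coverage`, `greedy_weighted_coverage`) and the `lengthscale` parameter are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    """Pairwise RBF similarities between rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * lengthscale ** 2))

def weighted_coverage(K, u, selected):
    """Assumed objective: mean over points i of u[i] * max_{j in selected} K[i, j]."""
    return float((u * K[:, selected].max(axis=1)).mean())

def greedy_weighted_coverage(K, u, budget):
    """Greedily pick `budget` indices that maximize the assumed objective."""
    n = K.shape[0]
    covered = np.zeros(n)              # covered[i] = max_{j in S} K[i, j]
    chosen = np.zeros(n, dtype=bool)   # marks indices already selected
    selected = []
    for _ in range(budget):
        # gains[j] = increase in the objective if candidate j were added
        gains = (u[:, None] * np.maximum(K - covered[:, None], 0.0)).mean(axis=0)
        gains[chosen] = -np.inf
        j = int(np.argmax(gains))
        selected.append(j)
        chosen[j] = True
        covered = np.maximum(covered, K[:, j])
    return selected

# Toy pool: two well-separated clusters in one dimension.
X = np.array([[0.0], [0.1], [0.2], [10.0], [10.1], [10.2]])
K = rbf_kernel(X)

# Uniform uncertainty: the selection behaves like plain coverage.
uniform = np.ones(len(X))
picks_uniform = greedy_weighted_coverage(K, uniform, budget=2)

# Higher uncertainty on the second cluster pulls the first pick there.
skewed = np.array([1.0, 1.0, 1.0, 5.0, 5.0, 5.0])
picks_skewed = greedy_weighted_coverage(K, skewed, budget=2)
```

Under these assumptions, uniform scores reduce the selection to a representation-style criterion (one point per cluster), while sharply peaked scores push early picks toward uncertain regions. This is one way to picture how a single objective could interpolate between low- and high-budget behavior.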

