One-Round Active Learning
This work addresses the challenge of efficient data selection in active learning when only one round of labeling is feasible, representing an incremental improvement over existing methods.
The paper tackles the problem of selecting unlabeled data points in one-round active learning so as to maximize model performance when the initially labeled data is limited. It proposes DULO, a data-driven framework that learns to predict model performance for a given dataset, and reports state-of-the-art results on various benchmarks.
In this work, we initiate the study of one-round active learning, which aims to select, using only the information from the initially labeled data points, a subset of unlabeled data points that achieves the highest model performance once labeled. The challenge of directly applying existing data selection criteria to the one-round setting is that they are not indicative of model performance when the available labeled data is limited. We address this challenge by explicitly modeling the dependence of model performance on the dataset. Specifically, we propose DULO, a data-driven framework for one-round active learning, in which we learn a model that predicts the model performance for a given dataset and then use this model to guide the selection of unlabeled data. Our results demonstrate that DULO achieves state-of-the-art performance on various active learning benchmarks in the one-round setting.
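The two-stage idea described above (first learn a predictor of model performance from the initially labeled data, then use it to choose which unlabeled points to label) can be illustrated with a minimal sketch. This is not the DULO implementation: the nearest-centroid learner, the label-free set summary in `featurize`, the ridge-regression utility model, the greedy selection loop, and the synthetic two-blob data are all assumptions chosen to keep the example short and self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)

def featurize(X):
    # Hypothetical label-free, order-invariant summary of a set of points:
    # per-feature mean and std, log of the set size, and a bias term.
    return np.concatenate([X.mean(axis=0), X.std(axis=0), [np.log(len(X))], [1.0]])

def train_and_score(X_tr, y_tr, X_val, y_val):
    # Stand-in learner (assumption): nearest-centroid classifier.
    # Returns its accuracy on the validation points.
    classes = np.unique(y_tr)
    centroids = np.stack([X_tr[y_tr == c].mean(axis=0) for c in classes])
    dists = ((X_val[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    preds = classes[dists.argmin(axis=1)]
    return float((preds == y_val).mean())

def fit_utility_model(X_lab, y_lab, X_val, y_val, n_samples=200, ridge=1e-2):
    # Stage 1: sample random subsets of the labeled data, record the accuracy
    # each subset yields, and regress accuracy on the subset summary.
    feats, utils = [], []
    for _ in range(n_samples):
        size = rng.integers(2, len(X_lab) + 1)
        idx = rng.choice(len(X_lab), size=size, replace=False)
        feats.append(featurize(X_lab[idx]))
        utils.append(train_and_score(X_lab[idx], y_lab[idx], X_val, y_val))
    F, u = np.array(feats), np.array(utils)
    # Ridge regression in closed form (assumption: a linear utility model).
    w = np.linalg.solve(F.T @ F + ridge * np.eye(F.shape[1]), F.T @ u)
    return lambda X: float(featurize(X) @ w)

def greedy_select(utility, X_pool, budget):
    # Stage 2: greedily add the pool point that most increases the predicted
    # performance of the selected set (greedy search is an assumption here).
    chosen, remaining = [], list(range(len(X_pool)))
    for _ in range(budget):
        best = max(remaining, key=lambda i: utility(X_pool[chosen + [i]]))
        chosen.append(best)
        remaining.remove(best)
    return chosen

def make_blobs(n):
    # Synthetic two-class data: two Gaussian blobs in 2D.
    y = rng.integers(0, 2, size=n)
    X = rng.normal(size=(n, 2)) + 3.0 * y[:, None]
    return X, y

# The initially labeled data is split into a part used to form training
# subsets and a held-out part used to score them.
X_init, y_init = make_blobs(80)
X_lab, y_lab = X_init[:30], y_init[:30]
X_val, y_val = X_init[30:], y_init[30:]
X_pool, _ = make_blobs(40)  # labels of the pool are never used

utility = fit_utility_model(X_lab, y_lab, X_val, y_val)
selected = greedy_select(utility, X_pool, budget=5)
print("indices selected for labeling:", selected)
```

Because the utility model only sees label-free summaries, it can score candidate subsets of the unlabeled pool directly, and the whole selection happens before any new label is requested, which is what distinguishes the one-round setting from iterative active learning.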