Direct Acquisition Optimization for Low-Budget Active Learning
This work addresses the challenge of deploying data-intensive ML models in domains with limited labeled data, though it is an incremental improvement focused on low-budget settings.
The paper tackles the problem of active learning performance degradation under low labeling budgets by introducing Direct Acquisition Optimization (DAO), which optimizes sample selection based on expected true loss reduction and outperforms state-of-the-art methods across seven benchmarks.
Active Learning (AL) has gained prominence in integrating data-intensive machine learning (ML) models into domains with limited labeled data. However, its effectiveness diminishes significantly when the labeling budget is low. In this paper, we first empirically observe the performance degradation of existing AL algorithms in low-budget settings, and then introduce Direct Acquisition Optimization (DAO), a novel AL algorithm that optimizes sample selection based on expected true loss reduction. Specifically, DAO utilizes influence functions to update model parameters and incorporates an additional acquisition strategy to mitigate bias in loss estimation. This approach facilitates a more accurate estimation of the overall error reduction, without extensive computation or reliance on labeled data. Experiments demonstrate DAO's effectiveness in low-budget settings, outperforming state-of-the-art approaches across seven benchmarks.
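To make the influence-function step more concrete, the following is a minimal sketch of the general idea: estimating how model parameters would change if one extra sample were added, without retraining. It is not the paper's implementation. It assumes a one-parameter ridge regression so that the Hessian is a scalar, and all names (`fit_ridge_1d`, `influence_update`, `retrain_with_extra`) and the toy data are hypothetical. In a multi-parameter model the division by the Hessian would become a linear solve against the Hessian matrix.

```python
import random


def fit_ridge_1d(xs, ys, lam):
    """Minimize mean(0.5 * (x * theta - y)^2) + 0.5 * lam * theta^2 in closed form.

    Returns the fitted parameter and the (scalar) Hessian of the objective.
    """
    n = len(xs)
    hessian = sum(x * x for x in xs) / n + lam
    theta = (sum(x * y for x, y in zip(xs, ys)) / n) / hessian
    return theta, hessian


def influence_update(theta, hessian, x_new, y_new, n):
    """First-order estimate of theta after adding one sample with weight 1/n.

    Uses the standard influence-function approximation:
    theta_new ~= theta - (1/n) * H^{-1} * grad_loss(new sample, theta).
    """
    grad = x_new * (x_new * theta - y_new)  # gradient of the new sample's loss
    return theta - grad / (n * hessian)


def retrain_with_extra(xs, ys, lam, x_new, y_new):
    """Exact minimizer of the original objective plus (1/n) * loss on the new sample."""
    n = len(xs)
    a = sum(x * x for x in xs) / n + lam + x_new * x_new / n
    b = (sum(x * y for x, y in zip(xs, ys)) + x_new * y_new) / n
    return b / a


# Toy data (an assumption for illustration only): y is roughly 2 * x plus noise.
random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(200)]
ys = [2.0 * x + random.gauss(0.0, 0.1) for x in xs]
lam = 0.1

theta, hessian = fit_ridge_1d(xs, ys, lam)
x_new, y_new = 1.5, 4.0  # hypothetical candidate sample with a (pseudo-)label

theta_influence = influence_update(theta, hessian, x_new, y_new, len(xs))
theta_exact = retrain_with_extra(xs, ys, lam, x_new, y_new)

print("original theta:       ", theta)
print("influence estimate:   ", theta_influence)
print("exact retrained theta:", theta_exact)
```

In this sketch the influence estimate lands much closer to the retrained parameter than the original parameter does, at the cost of one gradient and one Hessian inverse rather than a full refit. An acquisition rule in this spirit could score each unlabeled candidate by the loss reduction predicted from such an update, although how DAO obtains candidate labels and corrects the loss estimate is not specified in the abstract.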