Efficient Active Learning with Abstention
This work addresses the label efficiency challenge in machine learning by proposing a novel abstention-based approach, offering significant improvements for scenarios where labeling is costly or limited.
The paper tackles the problem of active learning by introducing an algorithm that can abstain from predictions on difficult examples, achieving polylog(1/ε) label complexity without low noise conditions, which reduces label complexity exponentially compared to passive and non-abstaining active learning.
The goal of active learning is to achieve the same accuracy achievable by passive learning, while using much fewer labels. Exponential savings in terms of label complexity have been proved in very special cases, but fundamental lower bounds show that such improvements are impossible in general. This suggests a need to explore alternative goals for active learning. Learning with abstention is one such alternative. In this setting, the active learning algorithm may abstain from prediction and incur an error that is marginally smaller than random guessing. We develop the first computationally efficient active learning algorithm with abstention. Our algorithm provably achieves $\mathsf{polylog}(\frac{1}{\varepsilon})$ label complexity, without any low noise conditions. Such performance guarantee reduces the label complexity by an exponential factor, relative to passive learning and active learning that is not allowed to abstain. Furthermore, our algorithm is guaranteed to only abstain on hard examples (where the true label distribution is close to a fair coin), a novel property we term proper abstention that also leads to a host of other desirable characteristics (e.g., recovering minimax guarantees in the standard setting, and avoiding the undesirable "noise-seeking" behavior often seen in active learning). We also provide novel extensions of our algorithm that achieve constant label complexity and deal with model misspecification.