LG · May 23, 2022

Active Learning Through a Covering Lens

arXiv:2205.11320v3 · 81 citations · h-index: 46 · Has Code
Originality: Highly original
AI Analysis

This work addresses the high annotation costs for deep learning models, particularly in low-budget scenarios, offering a practical solution for researchers and practitioners in computer vision.

The paper tackles the problem of reducing annotation costs for deep models in the low-budget active learning regime by proposing ProbCover, a new algorithm that maximizes probability coverage. The method improves on the state of the art on several image recognition benchmarks, especially in semi-supervised settings, where it lets semi-supervised methods match fully supervised performance with far fewer labels.

Deep active learning aims to reduce the annotation cost of training deep models, which are notoriously data-hungry. Until recently, deep active learning methods were ineffectual in the low-budget regime, where only a small number of examples are annotated. The situation has been alleviated by recent advances in representation and self-supervised learning, which endow the geometry of the data representation with rich information about the points. Taking advantage of this progress, we study the problem of subset selection for annotation through a "covering" lens, proposing ProbCover, a new active learning algorithm for the low-budget regime, which seeks to maximize Probability Coverage. We then describe a dual way to view the proposed formulation, from which one can derive strategies suitable for the high-budget regime of active learning, related to existing methods like Coreset. We conclude with extensive experiments, evaluating ProbCover in the low-budget regime. We show that our principled active learning strategy improves the state-of-the-art in the low-budget regime on several image recognition benchmarks. This method is especially beneficial in the semi-supervised setting, allowing state-of-the-art semi-supervised methods to match the performance of fully supervised methods while using far fewer labels. Code is available at https://github.com/avihu111/TypiClust.
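To make the "covering" view concrete, here is a minimal sketch of one way to pick points for annotation so that balls around them cover as much of the dataset as possible. The abstract does not spell out the procedure, so everything here is an assumption for illustration: a fixed ball radius `delta`, Euclidean distance in some embedding space, a greedy selection rule, and the hypothetical function name `greedy_ball_cover`. It is not taken from the authors' repository.

```python
import math


def greedy_ball_cover(points, delta, budget):
    """Greedily pick up to `budget` indices whose delta-balls cover the most points.

    Illustrative sketch only: `delta`, Euclidean distance, and the greedy rule
    are assumptions, not details confirmed by the abstract.
    """
    n = len(points)
    # neighbors[i] holds every index within distance delta of point i (including i).
    neighbors = [
        {j for j in range(n) if math.dist(points[i], points[j]) <= delta}
        for i in range(n)
    ]
    covered = set()
    selected = []
    for _ in range(budget):
        # Pick the point whose ball adds the most still-uncovered points.
        best = max(range(n), key=lambda i: len(neighbors[i] - covered))
        if not neighbors[best] - covered:
            break  # nothing left to gain: every point is already covered
        selected.append(best)
        covered |= neighbors[best]
    # Return the chosen indices and the fraction of points they cover.
    return selected, len(covered) / n


# Toy embeddings: a tight cluster of three, a pair, and one outlier.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (10.0, 10.0)]
selected, coverage = greedy_ball_cover(embeddings, delta=0.5, budget=2)
print(selected, coverage)  # selects indices [0, 3], covering 5 of the 6 points
```

With a budget of two labels, the sketch spends them on the two densest regions and leaves the outlier uncovered, which is the intuition behind preferring coverage when few annotations are available. The pairwise distance computation is quadratic in the number of points, so this form is only suitable for small examples.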

Code Implementations: 1 repo
