LGJul 16, 2024

Generalized Coverage for More Robust Low-Budget Active Learning

arXiv:2407.12212v211 citationsh-index: 11
Originality Incremental advance
AI Analysis

This work addresses robustness issues in low-budget active learning for image classification, offering an incremental improvement over prior methods.

The paper tackles the sensitivity of the ProbCover active learning method to hyper-parameter choice by introducing a generalized coverage objective, resulting in MaxHerding, which outperforms existing methods on low-budget image classification benchmarks with less computational cost.

The ProbCover method of Yehuda et al. is a well-motivated algorithm for active learning in low-budget regimes, which attempts to "cover" the data distribution with balls of a given radius at selected data points. We demonstrate, however, that the performance of this algorithm is extremely sensitive to the choice of this radius hyper-parameter, and that tuning it is quite difficult, with the original heuristic frequently failing. We thus introduce (and theoretically motivate) a generalized notion of "coverage," including ProbCover's objective as a special case, but also allowing smoother notions that are far more robust to hyper-parameter choice. We propose an efficient greedy method to optimize this coverage, generalizing ProbCover's algorithm; due to its close connection to kernel herding, we call it "MaxHerding." The objective can also be optimized non-greedily through a variant of $k$-medoids, clarifying the relationship to other low-budget active learning methods. In comprehensive experiments, MaxHerding surpasses existing active learning methods across multiple low-budget image classification benchmarks, and does so with less computational cost than most competitive methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes