LG · Jun 13, 2022

On the reusability of samples in active learning

arXiv:2206.06276v1 · h-index: 34
Originality: Synthesis-oriented
AI Analysis

This paper addresses a practical issue for machine learning practitioners who use active learning, but its contribution is incremental: it builds on existing active learning concepts without introducing a new method.

The paper tackles the problem of sample reusability in active learning, investigating whether samples selected for one learner can be reused by another, and concludes through theoretical arguments and experiments that universal reusability is impossible, with some limited reusability observed only in specific cases.

An interesting but not extensively studied question in active learning is that of sample reusability: to what extent can samples selected for one learner be reused by another? This paper explains why sample reusability is of practical interest, why reusability can be a problem, how reusability could be improved by importance-weighted active learning, and which obstacles to universal reusability remain. With theoretical arguments and practical demonstrations, this paper argues that universal reusability is impossible. Because every active learning strategy must undersample some areas of the sample space, learners that depend on the samples in those areas will learn more from a random sample selection. This paper describes several experiments with importance-weighted active learning that show the impact of the reusability problem in practice. The experiments confirmed that universal reusability does not exist, although in some cases -- on some datasets and with some pairs of classifiers -- there is sample reusability. Finally, this paper explores the conditions that could guarantee the reusability between two classifiers.
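The role of importance weighting in the abstract can be illustrated with a small sketch. It is not the paper's algorithm or experimental setup. It only shows the general idea of reweighting each queried sample by the inverse of its query probability. The function name `expected_estimates`, the four loss values, and the query probabilities are all hypothetical, chosen so that every possible query outcome can be enumerated exactly.

```python
from itertools import product


def expected_estimates(losses, query_probs):
    """Exact expectation of two loss estimates over all possible query outcomes.

    losses[i] is the loss of a second learner on point i (hypothetical values);
    query_probs[i] is the probability that the selection strategy queries point i.
    """
    n = len(losses)
    exp_weighted = 0.0
    exp_naive = 0.0
    # Enumerate every subset of queried points (2^n outcomes).
    for mask in product([0, 1], repeat=n):
        # Probability of observing exactly this set of queried points.
        prob = 1.0
        for queried, p in zip(mask, query_probs):
            prob *= p if queried else (1.0 - p)
        # Importance-weighted estimate: a queried point counts 1/p times.
        weighted = sum(
            loss / p
            for queried, loss, p in zip(mask, losses, query_probs)
            if queried
        ) / n
        # Naive estimate: plain mean over queried points (0 if none were queried).
        k = sum(mask)
        naive = (
            sum(loss for queried, loss in zip(mask, losses) if queried) / k
            if k
            else 0.0
        )
        exp_weighted += prob * weighted
        exp_naive += prob * naive
    return exp_weighted, exp_naive


# Assumed toy setting: the strategy rarely queries the two points on which
# the second learner has high loss.
losses = [0.0, 0.0, 1.0, 1.0]
query_probs = [0.9, 0.9, 0.1, 0.1]
weighted, naive = expected_estimates(losses, query_probs)
true_mean = sum(losses) / len(losses)
print(f"true mean loss: {true_mean}")
print(f"expected importance-weighted estimate: {weighted}")
print(f"expected naive estimate: {naive}")
```

In this toy setting the importance-weighted estimate matches the true mean loss in expectation, while the naive estimate underestimates it. The correction comes at a cost: each rarely queried point carries a weight of 1/0.1 = 10, so the estimate rests on very few samples from the undersampled area, which is the kind of area the abstract identifies as the source of the reusability problem.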

Code Implementations: 1 repo
