LG, ML · Oct 16, 2012

Active Learning with Distributional Estimates

arXiv:1210.4909v1 · 6 citations
Originality: Highly original
AI Analysis

This work addresses the challenge of labeling data efficiently in machine learning applications, proposing an uncertainty-based active learning framework that improves on existing active learning methods.

The paper approaches active learning as a balance between refining the current decision boundary and exploring poorly sampled regions. It introduces a scheme that treats the estimated class conditional probability as a random variable and models its uncertainty with a second-order distribution. On UCI, USPS, and Caltech4 datasets, the authors report significantly better learning curves than popular methods such as uncertainty sampling and error reduction sampling, with all methods using the same kernel density classifier.

Active Learning (AL) is increasingly important in a broad range of applications. Two main AL principles to obtain accurate classification with few labeled data are refinement of the current decision boundary and exploration of poorly sampled regions. In this paper we derive a novel AL scheme that balances these two principles in a natural way. In contrast to many AL strategies, which are based on an estimated class conditional probability $\hat{p}(y|x)$, a key component of our approach is to view this quantity as a random variable, hence explicitly considering the uncertainty in its estimated value. Our main contribution is a novel mathematical framework for uncertainty-based AL, and a corresponding AL scheme, where the uncertainty in $\hat{p}(y|x)$ is modeled by a second-order distribution. On the practical side, we show how to approximate such second-order distributions for kernel density classification. Finally, we find that over a large number of UCI, USPS and Caltech4 datasets, our AL scheme achieves significantly better learning curves than popular AL methods such as uncertainty sampling and error reduction sampling, when all use the same kernel density classifier.
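
The idea of treating the estimated class probability as a random variable can be illustrated with a minimal sketch. This is not the paper's derivation. It assumes a binary problem, a Gaussian kernel, and a Beta distribution whose pseudo-counts are kernel-weighted label counts. All function names, the bandwidth, and the toy data are hypothetical.

```python
import numpy as np

def gaussian_kernel(x, X, bandwidth=1.0):
    """Gaussian kernel weights between a query point x and each row of X."""
    sq_dist = np.sum((X - x) ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))

def beta_params(x, X_lab, y_lab, bandwidth=1.0, prior=1.0):
    """Parameters (a, b) of a Beta distribution over p(y=1|x).

    Assumption for illustration only: kernel weights of labeled points act
    as fractional pseudo-counts added to a uniform Beta(prior, prior).
    """
    w = gaussian_kernel(x, X_lab, bandwidth)
    a = prior + np.sum(w[y_lab == 1])
    b = prior + np.sum(w[y_lab == 0])
    return a, b

def beta_mean_std(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    total = a + b
    mean = a / total
    var = a * b / (total ** 2 * (total + 1.0))
    return mean, np.sqrt(var)

# Toy labeled set: class 0 clustered near (-1, 0), class 1 near (1, 0).
X_lab = np.array([[-1.0, 0.0], [-1.2, 0.1], [-0.8, -0.1],
                  [1.0, 0.0], [1.2, 0.1], [0.8, -0.1]])
y_lab = np.array([0, 0, 0, 1, 1, 1])

# A point on the decision boundary, close to the labeled data.
mean_b, std_b = beta_mean_std(*beta_params(np.array([0.0, 0.0]), X_lab, y_lab))
# A point in a poorly sampled region, far from all labeled data.
mean_f, std_f = beta_mean_std(*beta_params(np.array([0.0, 10.0]), X_lab, y_lab))

print("boundary point:", mean_b, std_b)
print("far point:     ", mean_f, std_f)
```

In this toy setup both points have the same point estimate of the class probability, so a criterion based only on that estimate cannot tell them apart. The spread of the second-order distribution is larger for the far point, which is the kind of information that lets a scheme weigh boundary refinement against exploration.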

