LGMLAug 3, 2020

Active Classification with Uncertainty Comparison Queries

arXiv:2008.00645v23 citations
AI Analysis

This work addresses inefficiencies in interactive learning of binary classifiers for applications like data labeling, though it is incremental as it builds on existing comparison oracles.

The paper tackles the problem of inefficient label inference in active classification by proposing a new pairwise comparison oracle that answers which of two data points has higher uncertainty, and an adaptive labeling algorithm using this oracle and a positivity comparison oracle, achieving improved query complexity and practical performance as confirmed theoretically and empirically.

Noisy pairwise comparison feedback has been incorporated to improve the overall query complexity of interactively learning binary classifiers. The \textit{positivity comparison oracle} is used to provide feedback on which is more likely to be positive given a pair of data points. Because it is impossible to infer accurate labels using this oracle alone \textit{without knowing the classification threshold}, existing methods still rely on the traditional \textit{explicit labeling oracle}, which directly answers the label given a data point. Existing methods conduct sorting on all data points and use explicit labeling oracle to find the classification threshold. The current methods, however, have two drawbacks: (1) they needs unnecessary sorting for label inference; (2) quick sort is naively adapted to noisy feedback and negatively affects practical performance. In order to avoid this inefficiency and acquire information of the classification threshold, we propose a new pairwise comparison oracle concerning uncertainties. This oracle receives two data points as input and answers which one has higher uncertainty. We then propose an efficient adaptive labeling algorithm using the proposed oracle and the positivity comparison oracle. In addition, we also address the situation where the labeling budget is insufficient compared to the dataset size, which can be dealt with by plugging the proposed algorithm into an active learning algorithm. Furthermore, we confirm the feasibility of the proposed oracle and the performance of the proposed algorithm theoretically and empirically.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes