LGMLNov 19, 2013

Beating the Minimax Rate of Active Learning with Prior Knowledge

arXiv:1311.4803v2
Originality Incremental advance
AI Analysis

This work addresses the challenge of reducing labeling costs in machine learning for scenarios with noisy data, representing an incremental advance by extending prior theoretical guarantees.

The paper tackles the problem of active learning's label complexity under the Tsybakov noise condition with parameter κ > 1, proposing a novel algorithm using a convex surrogate loss that achieves an exponential reduction in label complexity, broadening cases where active learning improves over passive learning.

Active learning refers to the learning protocol where the learner is allowed to choose a subset of instances for labeling. Previous studies have shown that, compared with passive learning, active learning is able to reduce the label complexity exponentially if the data are linearly separable or satisfy the Tsybakov noise condition with parameter $κ=1$. In this paper, we propose a novel active learning algorithm using a convex surrogate loss, with the goal to broaden the cases for which active learning achieves an exponential improvement. We make use of a convex loss not only because it reduces the computational cost, but more importantly because it leads to a tight bound for the empirical process (i.e., the difference between the empirical estimation and the expectation) when the current solution is close to the optimal one. Under the assumption that the norm of the optimal classifier that minimizes the convex risk is available, our analysis shows that the introduction of the convex surrogate loss yields an exponential reduction in the label complexity even when the parameter $κ$ of the Tsybakov noise is larger than $1$. To the best of our knowledge, this is the first work that improves the minimax rate of active learning by utilizing certain priori knowledge.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes