LGMLApr 19, 2019

Disagreement-based Active Learning in Online Settings

arXiv:1904.09056v517 citations
Originality Incremental advance
AI Analysis

This work addresses efficient learning in streaming environments for applications like real-time data processing, though it is incremental as it builds on existing disagreement-based methods in online settings.

The paper tackles the problem of online active learning for streaming data classification by developing a disagreement-based algorithm that minimizes label queries while bounding prediction errors, achieving a label complexity of O(dT^{(2-2α)/(2-α)} log^2 T) under Tsybakov noise and proving its order optimality with a matching lower bound.

We study online active learning for classifying streaming instances within the framework of statistical learning theory. At each time, the learner either queries the label of the current instance or predicts the label based on past seen examples. The objective is to minimize the number of queries while constraining the number of prediction errors over a horizon of length $T$. We develop a disagreement-based online learning algorithm for a general hypothesis space and under the Tsybakov noise. We show that the proposed algorithm has a label complexity of $O(dT^{\frac{2-2α}{2-α}}\log^2 T)$ under a constraint of bounded regret in terms of classification errors, where $d$ is the VC dimension of the hypothesis space and $α$ is the Tsybakov noise parameter. We further establish a matching (up to a poly-logarithmic factor) lower bound, demonstrating the order optimality of the proposed algorithm. We address the tradeoff between label complexity and regret and show that the algorithm can be modified to operate at a different point on the tradeoff curve.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes