Noise-tolerant, Reliable Active Classification with Comparison Queries
This addresses the need for robust and efficient active learning algorithms in handling massive unlabeled data, offering practical improvements for noisy real-world scenarios.
The paper tackles the problem of learning non-homogeneous linear separators efficiently under bounded noise by introducing comparison queries, achieving time and query efficiency with reliability guarantees such as error-free classifiers with high probability.
With the explosion of massive, widely available unlabeled data in the past years, finding label and time efficient, robust learning algorithms has become ever more important in theory and in practice. We study the paradigm of active learning, in which algorithms with access to large pools of data may adaptively choose what samples to label in the hope of exponentially increasing efficiency. By introducing comparisons, an additional type of query comparing two points, we provide the first time and query efficient algorithms for learning non-homogeneous linear separators robust to bounded (Massart) noise. We further provide algorithms for a generalization of the popular Tsybakov low noise condition, and show how comparisons provide a strong reliability guarantee that is often impractical or impossible with only labels - returning a classifier that makes no errors with high probability.