Efficient active learning of sparse halfspaces
This work addresses the challenge of attribute-efficient active learning for sparse halfspaces, offering a computationally efficient solution that reduces label dependency on dimensionality, though it is incremental as it builds on prior assumptions and settings.
The paper tackles the problem of efficiently learning sparse halfspaces with minimal label queries in active learning, achieving a label complexity of O(t · polylog(d, 1/ε)) under distributional assumptions, which improves over existing methods that are either inefficient or require polynomial label requirements.
We study the problem of efficient PAC active learning of homogeneous linear classifiers (halfspaces) in $\mathbb{R}^d$, where the goal is to learn a halfspace with low error using as few label queries as possible. Under the extra assumption that there is a $t$-sparse halfspace that performs well on the data ($t \ll d$), we would like our active learning algorithm to be {\em attribute efficient}, i.e. to have label requirements sublinear in $d$. In this paper, we provide a computationally efficient algorithm that achieves this goal. Under certain distributional assumptions on the data, our algorithm achieves a label complexity of $O(t \cdot \mathrm{polylog}(d, \frac 1 ε))$. In contrast, existing algorithms in this setting are either computationally inefficient, or subject to label requirements polynomial in $d$ or $\frac 1 ε$.