Attribute-Efficient PAC Learning of Sparse Halfspaces with Constant Malicious Noise Rate
This addresses a fundamental challenge in machine learning theory for robust learning under adversarial noise, though it appears incremental as it builds on existing hinge loss methods with specific modifications.
The paper tackles the problem of attribute-efficient PAC learning of sparse halfspaces in the presence of a constant rate of malicious noise, achieving this with poly(s, log d) samples under distributional conditions like concentration and margin. It introduces a variant of hinge loss minimization with a new gradient analysis to handle sparsity constraints.
Attribute-efficient learning of sparse halfspaces has been a fundamental problem in machine learning theory. In recent years, machine learning algorithms are faced with prevalent data corruptions or even adversarial attacks. It is of central interest to design efficient algorithms that are robust to noise corruptions. In this paper, we consider that there exists a constant amount of malicious noise in the data and the goal is to learn an underlying $s$-sparse halfspace $w^* \in \mathbb{R}^d$ with $\text{poly}(s,\log d)$ samples. Specifically, we follow a recent line of works and assume that the underlying distribution satisfies a certain concentration condition and a margin condition at the same time. Under such conditions, we show that attribute-efficiency can be achieved by simple variants to existing hinge loss minimization programs. Our key contribution includes: 1) an attribute-efficient PAC learning algorithm that works under constant malicious noise rate; 2) a new gradient analysis that carefully handles the sparsity constraint in hinge loss minimization.