LG CC IT ML · Mar 8, 2023

Agnostic PAC Learning of k-juntas Using L2-Polynomial Regression

arXiv:2303.04859v12.01 citations · h-index: 9

Originality: Highly original

AI Analysis

This paper provides a computationally efficient method for learning juntas in the agnostic setting, addressing a theoretical question that has been open since 1993.

The paper resolves the long-standing open problem of whether L2 regression is an agnostic PAC learner for 0-1 loss by proving it for k-juntas on the Boolean cube, and introduces a Fourier-based algorithm with lower computational complexity that works without distributional assumptions.

Many conventional learning algorithms rely on loss functions other than the natural 0-1 loss for computational efficiency and theoretical tractability. Among them are approaches based on absolute loss (L1 regression) and square loss (L2 regression). L1 regression has been proved to be an agnostic PAC learner for several important concept classes, such as juntas and half-spaces. L2 regression, on the other hand, is preferable for its computational efficiency, which is linear in the sample size. However, its PAC learnability is still unknown, as guarantees have been proved only under distributional restrictions. The question of whether L2 regression is an agnostic PAC learner for 0-1 loss has been open since 1993 and has yet to be answered. This paper resolves this problem for the junta class on the Boolean cube by proving agnostic PAC learning of k-juntas using L2 polynomial regression. Moreover, we present a new PAC learning algorithm based on the Boolean Fourier expansion with lower computational complexity. Fourier-based algorithms, such as that of Linial et al. (1993), have been used under distributional restrictions, such as the uniform distribution. We show that, with an appropriate change, one can apply those algorithms in agnostic settings without any distributional assumption. We prove our results by connecting PAC learning with 0-1 loss to the minimum mean square estimation (MMSE) problem. We derive an elegant upper bound on the 0-1 loss in terms of the MMSE error and show that the sign of the MMSE estimator is a PAC learner for any concept class containing it.
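To make the "L2 polynomial regression followed by a sign" idea concrete, here is a minimal sketch, not the authors' implementation. It assumes inputs in {-1,+1}^n, labels in {-1,+1}, and that "polynomial regression" means a least-squares fit over all monomials (parities) of degree at most k; the function names `parity_features` and `fit_l2_sign` are hypothetical. The feature set has O(n^k) columns, so the sketch is only practical for small k.

```python
import itertools
import numpy as np

def parity_features(X, k):
    """Build monomials chi_S(x) = prod_{i in S} x_i for every subset S with |S| <= k.

    X is an (m, n) array with entries in {-1, +1}.
    Returns the (m, num_subsets) feature matrix and the list of subsets.
    """
    m, n = X.shape
    subsets = [S for d in range(k + 1)
               for S in itertools.combinations(range(n), d)]
    cols = [np.ones(m) if not S else np.prod(X[:, list(S)], axis=1)
            for S in subsets]
    return np.column_stack(cols), subsets

def fit_l2_sign(X, y, k):
    """Least-squares fit of a degree-<=k polynomial, then threshold at zero.

    Returns a predictor mapping inputs to {-1, +1} and a dict of
    fitted coefficients keyed by subset.
    """
    Phi, subsets = parity_features(X, k)
    # Ordinary least squares; lstsq also handles rank-deficient designs.
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)

    def predict(X_new):
        Phi_new, _ = parity_features(X_new, k)
        return np.where(Phi_new @ coef >= 0, 1, -1)

    return predict, dict(zip(subsets, coef))
```

On noiseless labels generated by a 2-junta such as y = x_0 * x_3, a fit with k = 2 should put essentially all of its weight on the subset (0, 3). The sketch makes no claim about the sample complexity or the guarantees proved in the paper.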
