Iterative Chow Filtering for Learning with Distribution Shift
Provides the first efficient PQ learning algorithm for DNFs, addressing a key bottleneck in learning with distribution shift for this important class.
The paper shows that L1 sandwiching suffices for efficient PQ learning under distribution shift, yielding the first quasipolynomial-time algorithm for DNFs under the uniform distribution and exponential improvements for several function classes.
Recent work due to Goel et al. gave the first efficient algorithms for learning with distribution shift in the challenging PQ framework. In this setting, a learner receives labeled training examples and unlabeled test examples, and must make correct predictions on the test set but is allowed to abstain from predicting on out-of-distribution points. Their results rely on ${\cal L}_2$ sandwiching approximations, a strong requirement that leads to poor bounds for several basic function classes such as DNF formulas. Here, we show that the weaker notion of ${\cal L}_1$ sandwiching suffices for efficient PQ learning. As a consequence, we obtain the first quasipolynomial-time PQ learning algorithm for DNFs under the uniform distribution and essentially match the guarantees known for ordinary PAC learning. More broadly, our bounds provide exponential improvements for several classes, including constant-depth circuits and constant-degree polynomial threshold functions. Our main technical ingredient is Iterative Chow Filtering, a new procedure that uses low-degree Chow parameters to identify and remove test points incompatible with the training distribution.
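
As a concrete illustration of the statistic named above, the following is a minimal sketch of computing empirical low-degree Chow parameters. It assumes the Boolean domain $\{-1,+1\}^n$ and the standard definition of the degree-$k$ Chow parameters of $f$ as the correlations $\mathbb{E}[f(x)\prod_{i \in S} x_i]$ over all index sets $S$ with $|S| \le k$. The function name `chow_parameters` and the toy data are hypothetical, and the sketch shows only the statistic, not the filtering procedure itself, which the abstract does not specify.

```python
from itertools import combinations, product

import numpy as np


def chow_parameters(X, y, degree):
    """Empirical low-degree Chow parameters (hypothetical helper).

    X: (m, n) array with entries in {-1, +1}.
    y: (m,) array of labels in {-1, +1}.
    Returns a dict mapping each index tuple S with |S| <= degree
    to the sample average of y * prod_{i in S} x_i.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    _, n = X.shape
    params = {}
    for k in range(degree + 1):
        for S in combinations(range(n), k):
            # Monomial chi_S(x); for the empty set the product is 1.
            chi = np.prod(X[:, list(S)], axis=1)
            params[S] = float(np.mean(y * chi))
    return params


# Toy example (assumed data): all 8 points of {-1,+1}^3, labeled by f(x) = x_0.
X = np.array(list(product([-1, 1], repeat=3)))
y = X[:, 0]
params = chow_parameters(X, y, degree=2)
```

On this toy input the only nonzero parameter is the one indexed by `(0,)`, since the labels are exactly the first coordinate. The number of parameters grows as $\sum_{j \le k} \binom{n}{j}$, so the sketch is practical only for small degree.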