LGDSMLJun 26, 2018

Conditional Sparse $\ell_p$-norm Regression With Optimal Probability

arXiv:1806.10222v1
Originality Incremental advance
AI Analysis

This work addresses the challenge of learning accurate models on subsets of data where simple rules apply, which is incremental as it builds on prior state-of-the-art with better guarantees.

The paper tackles the problem of conditional linear regression to identify a subset of data where a linear rule provides a good fit, improving on prior algorithms that had poor probability guarantees and loss approximations. It presents efficient algorithms that nearly match the ideal condition's probability and enhance the loss approximation, with specific bounds like O(n^k) for sparse regression.

We consider the following conditional linear regression problem: the task is to identify both (i) a $k$-DNF condition $c$ and (ii) a linear rule $f$ such that the probability of $c$ is (approximately) at least some given bound $μ$, and $f$ minimizes the $\ell_p$ loss of predicting the target $z$ in the distribution of examples conditioned on $c$. Thus, the task is to identify a portion of the distribution on which a linear rule can provide a good fit. Algorithms for this task are useful in cases where simple, learnable rules only accurately model portions of the distribution. The prior state-of-the-art for such algorithms could only guarantee finding a condition of probability $Ω(μ/n^k)$ when a condition of probability $μ$ exists, and achieved an $O(n^k)$-approximation to the target loss, where $n$ is the number of Boolean attributes. Here, we give efficient algorithms for solving this task with a condition $c$ that nearly matches the probability of the ideal condition, while also improving the approximation to the target loss. We also give an algorithm for finding a $k$-DNF reference class for prediction at a given query point, that obtains a sparse regression fit that has loss within $O(n^k)$ of optimal among all sparse regression parameters and sufficiently large $k$-DNF reference classes containing the query point.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes