LG DSMay 12, 2023

Reduced Label Complexity For Tight $\ell_2$ Regression

arXiv:2305.07486v12.0

Originality Incremental advance

AI Analysis

This work addresses the challenge of minimizing label acquisition costs in regression tasks, offering a method to achieve tight approximations with fewer labels, though it is incremental as it leaves open the possibility of further reduction.

The paper tackles the problem of reducing label complexity in tight ℓ₂ regression by developing a polynomial algorithm that, without using labels, discards a subset of data points to achieve a (1+d/n)-approximation in expectation, reducing label complexity by Ω(√n).

Given data ${\rm X}\in\mathbb{R}^{n\times d}$ and labels $\mathbf{y}\in\mathbb{R}^{n}$ the goal is find $\mathbf{w}\in\mathbb{R}^d$ to minimize $\Vert{\rm X}\mathbf{w}-\mathbf{y}\Vert^2$. We give a polynomial algorithm that, \emph{oblivious to $\mathbf{y}$}, throws out $n/(d+\sqrt{n})$ data points and is a $(1+d/n)$-approximation to optimal in expectation. The motivation is tight approximation with reduced label complexity (number of labels revealed). We reduce label complexity by $Ω(\sqrt{n})$. Open question: Can label complexity be reduced by $Ω(n)$ with tight $(1+d/n)$-approximation?

View on arXiv PDF

Similar