Revisiting Approximate Leverage Score Sketching for Matrix Least Squares

arXiv:2201.1063824.06 citationsh-index: 49

AI Analysis

Provides tighter theoretical guarantees for a known sketching method, benefiting practitioners in large-scale least squares regression.

The paper revisits approximate leverage score sketching for tall-and-skinny matrix least squares problems, deriving improved sample complexity bounds: exact leverage scores require 4r/(βδε) samples, and a hybrid deterministic-random scheme reduces samples by factor 1/(1-p_det).

We revisit the problem of sketching using approximate leverage scores for matrix least squares problems of the form $\| AX - B \|_F^2$ where the design matrix $A \in \mathbb{R}^{N \times r}$ is tall and skinny with $N \gg r$. We derive the theoretical results from first principles and clarify the relation to previously stated bounds, improving some constants along the way. One can characterize the utility of a sketching scheme according to the number of samples it needs for an $\varepsilon$-accurate solution with high probability. Assuming $\varepsilon$ is suitably small, we will show that approximate leverage score sampling requires $4r/(Î²Î´\varepsilon)$ samples, where $Î´$ is the failure probability and $Î²\in (0,1]$ is a measure of the quality of the approximate leverage scores such that $Î²=1$ corresponds to using exact leverage scores. In cases where a few approximate leverage scores are very large (summing to $p_{\rm det}$), we also show that using a hybrid deterministic and random sampling scheme reduces the required number of samples by a factor of $1/(1-p_{\rm det})$.

View on arXiv PDF

Similar