LGCGMLApr 26, 2022

One-pass additive-error subset selection for $\ell_{p}$ subspace approximation

arXiv:2204.12073v14 citationsh-index: 22
Originality Incremental advance
AI Analysis

This work provides a more efficient solution for data analysis tasks involving large datasets, though it is incremental as it builds on prior subset selection methods by extending applicability to a broader range of p values.

The paper tackles the problem of subset selection for ℓ_p subspace approximation by developing a one-pass algorithm that provides an additive approximation guarantee for any p in [1, ∞), addressing the limitation of previous methods that required multiple passes or only worked for special cases like p=2.

We consider the problem of subset selection for $\ell_{p}$ subspace approximation, that is, to efficiently find a \emph{small} subset of data points such that solving the problem optimally for this subset gives a good approximation to solving the problem optimally for the original input. Previously known subset selection algorithms based on volume sampling and adaptive sampling \cite{DeshpandeV07}, for the general case of $p \in [1, \infty)$, require multiple passes over the data. In this paper, we give a one-pass subset selection with an additive approximation guarantee for $\ell_{p}$ subspace approximation, for any $p \in [1, \infty)$. Earlier subset selection algorithms that give a one-pass multiplicative $(1+ε)$ approximation work under the special cases. Cohen \textit{et al.} \cite{CohenMM17} gives a one-pass subset section that offers multiplicative $(1+ε)$ approximation guarantee for the special case of $\ell_{2}$ subspace approximation. Mahabadi \textit{et al.} \cite{MahabadiRWZ20} gives a one-pass \emph{noisy} subset selection with $(1+ε)$ approximation guarantee for $\ell_{p}$ subspace approximation when $p \in \{1, 2\}$. Our subset selection algorithm gives a weaker, additive approximation guarantee, but it works for any $p \in [1, \infty)$.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes