Subset selection for matrices by column exchange
For researchers needing efficient subset selection from large matrices, this offers a faster algorithm with guaranteed bounds.
The paper proposes a faster greedy algorithm for selecting a submatrix that provides a good representation of the whole matrix, achieving asymptotic speedup for n >> m while maintaining the same quality bounds. The algorithm performs column exchanges and a new upper bound on the number of exchanges is proven.
The paper considers the problem of finding a submatrix $X_{\mathcal{S}} \in \mathbb{R}^{m \times k}$ in a matrix $X \in \mathbb{R}^{m \times n}$, such that the spectral or Frobenius norm of $X_{\mathcal{S}}^† X$ is limited, which guarantees it provides a good representation of the whole matrix. Such bounds can be reached by applying greedy algorithms, maximizing the submatrix volume. We suggest a modification of a greedy volume maximization, which performs column exchanges asymptotically faster for $n \gg m$ than the known alternatives, while guaranteeing the same bounds on $X_{\mathcal{S}}^† X$. In addition, we prove a new upper bound on the number of required exchanges, which is applicable to the new algorithm as well as to other greedy volume maximization algorithms.