DSCCMLFeb 5, 2015

Distributed Estimation of Generalized Matrix Rank: Efficient Algorithms and Lower Bounds

arXiv:1502.01403v223 citations
AI Analysis

This work addresses communication efficiency for distributed matrix computations, which is crucial for large-scale data processing, though it is incremental as it builds on existing distributed algorithms.

The paper tackles the problem of estimating the generalized matrix rank in a distributed setting, where the matrix is split across machines, and shows that any deterministic algorithm requires Ω(n²) bits of communication, but a randomized algorithm achieves only Õ(n) bits, with matching lower bounds, and demonstrates practical effectiveness through experiments.

We study the following generalized matrix rank estimation problem: given an $n \times n$ matrix and a constant $c \geq 0$, estimate the number of eigenvalues that are greater than $c$. In the distributed setting, the matrix of interest is the sum of $m$ matrices held by separate machines. We show that any deterministic algorithm solving this problem must communicate $Ω(n^2)$ bits, which is order-equivalent to transmitting the whole matrix. In contrast, we propose a randomized algorithm that communicates only $\widetilde O(n)$ bits. The upper bound is matched by an $Ω(n)$ lower bound on the randomized communication complexity. We demonstrate the practical effectiveness of the proposed algorithm with some numerical experiments.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes