NA LGJan 28, 2016

Distributed Low Rank Approximation of Implicit Functions of a Matrix

arXiv:1601.07721v19.233 citations

Originality Incremental advance

AI Analysis

This work addresses efficient distributed computation for machine learning tasks involving implicit matrix functions, offering practical solutions with proven communication bounds, though it is incremental in extending low rank approximation techniques to distributed settings.

The paper tackles the problem of distributed low rank approximation for matrices implicitly represented across servers, such as entrywise functions of aggregated matrices, by developing protocols that achieve additive error guarantees with communication cost scaling as d·(sk/ε)^O(1). The result includes applications to softmax, Gaussian kernels, and robust estimators, with experimental validation on real datasets.

We study distributed low rank approximation in which the matrix to be approximated is only implicitly represented across the different servers. For example, each of $s$ servers may have an $n \times d$ matrix $A^t$, and we may be interested in computing a low rank approximation to $A = f(\sum_{t=1}^s A^t)$, where $f$ is a function which is applied entrywise to the matrix $\sum_{t=1}^s A^t$. We show for a wide class of functions $f$ it is possible to efficiently compute a $d \times d$ rank-$k$ projection matrix $P$ for which $\|A - AP\|_F^2 \leq \|A - [A]_k\|_F^2 + \varepsilon \|A\|_F^2$, where $AP$ denotes the projection of $A$ onto the row span of $P$, and $[A]_k$ denotes the best rank-$k$ approximation to $A$ given by the singular value decomposition. The communication cost of our protocols is $d \cdot (sk/\varepsilon)^{O(1)}$, and they succeed with high probability. Our framework allows us to efficiently compute a low rank approximation to an entry-wise softmax, to a Gaussian kernel expansion, and to $M$-Estimators applied entrywise (i.e., forms of robust low rank approximation). We also show that our additive error approximation is best possible, in the sense that any protocol achieving relative error for these problems requires significantly more communication. Finally, we experimentally validate our algorithms on real datasets.

View on arXiv PDF

Similar