DC NA NA · Apr 17, 2018

Coded Sparse Matrix Multiplication

arXiv:1802.03430 · 134 citations · h-index: 74

AI Analysis

For distributed machine learning systems, this work addresses the straggler problem while preserving sparsity, reducing decoding time from O(rt) to O(nnz(C)).

The paper proposes a coded sparse matrix multiplication scheme that achieves near-optimal recovery threshold, low computation overhead, and linear decoding time O(nnz(C)), outperforming uncoded and existing coded strategies in large-scale distributed settings.

In a large-scale distributed matrix multiplication problem $C=A^{\intercal}B$, where $C\in\mathbb{R}^{r\times t}$, coded computation plays an important role in dealing effectively with "stragglers" (distributed computations that are delayed by a few slow or faulty processors). However, existing coded schemes can destroy the significant sparsity present in large-scale machine learning problems and can result in much higher computation overhead, i.e., $O(rt)$ decoding time. In this paper, we develop a new coded computation strategy, which we call \emph{sparse code}, that achieves a near-\emph{optimal recovery threshold}, \emph{low computation overhead}, and \emph{linear decoding time} $O(nnz(C))$, where $nnz(C)$ is the number of nonzero entries of $C$. We implement our scheme and demonstrate its advantage over both uncoded and the current fastest coded strategies.
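The sparsity concern in the abstract can be illustrated with a toy sketch. It assumes a coded scheme in which each worker receives a linear combination of input blocks. The block layout, the weights, and the helper names (`shifted_diagonal`, `combine`, `nnz`) are hypothetical choices made for illustration; this is not the paper's construction. The sketch only shows that a combination with nonzero weights on every block is much denser than any single block, while a combination over a few blocks stays comparatively sparse.

```python
def shifted_diagonal(n, shift):
    """Hypothetical sparse n x n block, stored as {(row, col): value}, with n nonzeros."""
    return {(i, (i + shift) % n): 1.0 for i in range(n)}


def combine(blocks, weights):
    """Return sum_k weights[k] * blocks[k], skipping blocks whose weight is zero."""
    out = {}
    for block, w in zip(blocks, weights):
        if w == 0:
            continue
        for pos, val in block.items():
            out[pos] = out.get(pos, 0.0) + w * val
    # Drop entries that cancelled to exactly zero.
    return {pos: v for pos, v in out.items() if v != 0.0}


def nnz(block):
    """Number of stored nonzero entries."""
    return len(block)


n = 8
blocks = [shifted_diagonal(n, k) for k in range(4)]

# Dense coding (assumption): every block contributes to the coded block.
dense_coded = combine(blocks, [1, 2, 3, 4])
# Sparse coding (assumption): only two blocks contribute.
sparse_coded = combine(blocks, [1, 0, 0, 2])

print(nnz(blocks[0]), nnz(dense_coded), nnz(sparse_coded))  # 8 32 16
```

In this toy case the densely coded block holds four times as many nonzeros as an original block, so a worker multiplying it does roughly four times the work. Limiting the number of blocks per combination keeps that overhead low.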
