Approximate Distributed Coded Computing: Polynomial Codes and Randomized Sketching
For practitioners of large-scale distributed machine learning, this work provides a survey and combination of two techniques to mitigate straggler effects, but it is primarily a review with no new empirical results.
This paper reviews the foundations of coded computing and randomized numerical linear algebra, and presents distributed schemes that combine both to speed up optimization and machine learning algorithms in the presence of slow or non-responsive servers.
Coded computing is a distributed paradigm that uses coding theory to introduce \textit{redundancy} and overcome bottlenecks in large-scale systems. In the same vein, randomized numerical linear algebra employs probabilistic methods to \textit{compress} and accelerate linear algebraic operations, addressing challenges in high-dimensional data analysis. This article reviews the foundations of both fields and presents distributed schemes that combine techniques from both to speed up optimization and machine learning algorithms, in the presence of slow or non-responsive servers. Along the way, we touch on various related topics and mathematical concepts.