IT ML Jul 12, 2017

Gradient Coding from Cyclic MDS Codes and Expander Graphs

Netanel Raviv, Itzhak Tamo, Rashish Tandon, Alexandros G. Dimakis

arXiv:1707.03858v3 21.8 195 citations

Originality: Incremental advance

AI Analysis

This work addresses stragglers in distributed learning. Its advance is incremental: gradient codes built from cyclic MDS codes, plus an approximate variant that accepts a small gradient error in exchange for less computation per worker.

The paper tackles straggler mitigation in distributed learning by designing novel gradient codes from cyclic MDS codes and by introducing an approximate variant that reduces computation. In experiments on Amazon EC2, approximate gradient coding based on expander graphs reaches a generalization error close to that of the full gradient while requiring significantly less computation from the workers.

Gradient coding is a technique for straggler mitigation in distributed learning. In this paper we design novel gradient codes using tools from classical coding theory, namely, cyclic MDS codes, which compare favorably with existing solutions, both in the applicable range of parameters and in the complexity of the involved algorithms. Second, we introduce an approximate variant of the gradient coding problem, in which we settle for approximate gradient computation instead of the exact one. This approach enables graceful degradation, i.e., the $\ell_2$ error of the approximate gradient is a decreasing function of the number of stragglers. Our main result is that normalized adjacency matrices of expander graphs yield excellent approximate gradient codes, which enable significantly less computation compared to exact gradient coding, and guarantee faster convergence than trivial solutions under standard assumptions. We experimentally test our approach on Amazon EC2, and show that the generalization error of approximate gradient coding is very close to the full gradient while requiring significantly less computation from the workers.
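The graceful-degradation idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation and rests on several assumptions: a circulant graph stands in for an expander (it is regular, but no claim is made about its expansion quality), worker i is assumed to return the average of the partial gradients on its d neighbours (row i of the normalized adjacency matrix), and the master is assumed to sum the responses it receives and rescale by n/(n - s). The function names `circulant_adjacency` and `decoding_error` are hypothetical. The quantity computed is the $\ell_2$ distance between the coefficients each data partition ends up with and the all-ones vector that an exact gradient would need.

```python
import math


def circulant_adjacency(n, offsets):
    """Adjacency matrix of a circulant graph on n nodes.

    Node i is linked to i + k and i - k (mod n) for each k in offsets.
    Assumes distinct offsets with 0 < k < n / 2, so the degree is 2 * len(offsets).
    This is only a stand-in for an expander graph.
    """
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in offsets:
            A[i][(i + k) % n] = 1
            A[i][(i - k) % n] = 1
    return A


def decoding_error(A, stragglers):
    """l2 distance between the master's effective coefficients and all-ones.

    Assumed scheme: worker i sends (1/d) * sum of the partial gradients of its
    neighbours; the master adds up the non-straggler responses and multiplies
    by n / (n - s), where s is the number of stragglers.
    """
    n = len(A)
    d = sum(A[0])  # the graph is regular, so every row has the same sum
    alive = [i for i in range(n) if i not in stragglers]
    scale = n / len(alive)
    # Coefficient that data partition j receives in the master's estimate.
    coeffs = [scale * sum(A[i][j] for i in alive) / d for j in range(n)]
    return math.sqrt(sum((c - 1.0) ** 2 for c in coeffs))


A = circulant_adjacency(12, offsets=(1, 2, 5))  # 12 workers, degree 6
for stragglers in [set(), {0}, {0, 3}, {0, 3, 7}]:
    print(len(stragglers), "stragglers -> error", round(decoding_error(A, stragglers), 4))
```

With no stragglers the error is zero, because every column of the normalized adjacency matrix of a regular graph sums to one. With stragglers the error is nonzero, and how fast it grows depends on the graph; this is where the paper's use of expanders comes in.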
