IT · ML · Jul 12, 2017

Gradient Coding from Cyclic MDS Codes and Expander Graphs

arXiv:1707.03858v3 · 195 citations
Originality: Incremental advance
AI Analysis

This work addresses straggler issues in distributed learning, offering incremental improvements: more efficient exact gradient codes and an approximate variant that requires less computation from each worker.

The paper tackles straggler mitigation in distributed learning by designing novel gradient codes based on cyclic MDS codes and by introducing an approximate variant that reduces computation. In experiments on Amazon EC2, approximate gradient coding with expander graphs achieves a generalization error close to that of the full gradient while requiring significantly less computation from the workers.

Gradient coding is a technique for straggler mitigation in distributed learning. In this paper we design novel gradient codes using tools from classical coding theory, namely, cyclic MDS codes, which compare favorably with existing solutions, both in the applicable range of parameters and in the complexity of the involved algorithms. Second, we introduce an approximate variant of the gradient coding problem, in which we settle for approximate gradient computation instead of the exact one. This approach enables graceful degradation, i.e., the $\ell_2$ error of the approximate gradient is a decreasing function of the number of stragglers. Our main result is that normalized adjacency matrices of expander graphs yield excellent approximate gradient codes, which enable significantly less computation compared to exact gradient coding, and guarantee faster convergence than trivial solutions under standard assumptions. We experimentally test our approach on Amazon EC2, and show that the generalization error of approximate gradient coding is very close to the full gradient while requiring significantly less computation from the workers.
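The approximate scheme described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the random regular multigraph, the decoding weights n/(n - s) on non-stragglers, and all function names are assumptions chosen for illustration. Worker i sends the combination of partition gradients given by row i of the normalized adjacency matrix B, and the master rescales the sum of the responses it receives. Because B is symmetric with unit row sums, the coding error is at most lam * sqrt(n*s/(n - s)), where lam is the second-largest absolute eigenvalue of B and s is the number of stragglers; with lam = 1 this is exactly the error of the trivial scheme in which each worker handles only its own partition.

```python
import numpy as np

def random_regular_adjacency(n, d, rng):
    """Symmetric n x n matrix whose rows each sum to d (d must be even).

    Assumption for illustration: built from d/2 random permutation matrices
    plus their transposes, so self-loops and repeated edges may occur.
    """
    assert d % 2 == 0
    A = np.zeros((n, n))
    for _ in range(d // 2):
        P = np.eye(n)[rng.permutation(n)]  # random permutation matrix
        A += P + P.T
    return A

def decode_weights(n, stragglers):
    """Weight n/(n-s) on workers that responded, 0 on stragglers."""
    idx = np.asarray(stragglers, dtype=int)
    a = np.full(n, n / (n - idx.size))
    a[idx] = 0.0
    return a

def coding_error(M, stragglers):
    """l2 distance between the decoded combination a^T M and the all-ones vector."""
    n = M.shape[0]
    return np.linalg.norm(decode_weights(n, stragglers) @ M - np.ones(n))

rng = np.random.default_rng(0)
n, d, dim = 20, 4, 5
B = random_regular_adjacency(n, d, rng) / d      # normalized adjacency matrix

grads = rng.normal(size=(n, dim))                # row j: gradient of partition j
full_gradient = grads.sum(axis=0)
worker_outputs = B @ grads                       # row i: what worker i sends back

def approx_gradient(stragglers):
    """Master-side estimate of the full gradient from non-straggler outputs."""
    return decode_weights(n, stragglers) @ worker_outputs

# Second-largest absolute eigenvalue of B controls the error bound.
lam = np.sort(np.abs(np.linalg.eigvalsh(B)))[-2]

stragglers = [0, 3, 7]
s = len(stragglers)
trivial_error = np.sqrt(n * s / (n - s))         # error when B is the identity
print("coded error:  ", coding_error(B, stragglers))
print("spectral bound:", lam * trivial_error)
print("trivial error: ", coding_error(np.eye(n), stragglers))
```

With no stragglers the estimate equals the full gradient exactly, since the columns of B sum to one. A graph with a smaller lam (a better expander) tightens the bound, which is the sense in which expander graphs give good approximate gradient codes.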

