MLDCITLGMay 22, 2018

Robust Gradient Descent via Moment Encoding with LDPC Codes

arXiv:1805.08327v241 citations
Originality Incremental advance
AI Analysis

This addresses performance bottlenecks in distributed machine learning for large-scale computing systems, offering an incremental improvement over prior encoding methods.

The paper tackles the problem of straggling processors in distributed gradient descent by encoding the second-moment of data with LDPC codes, which reduces computational overhead and automatically adjusts to straggler numbers, outperforming existing schemes in real setups.

This paper considers the problem of implementing large-scale gradient descent algorithms in a distributed computing setting in the presence of {\em straggling} processors. To mitigate the effect of the stragglers, it has been previously proposed to encode the data with an erasure-correcting code and decode at the master server at the end of the computation. We, instead, propose to encode the second-moment of the data with a low density parity-check (LDPC) code. The iterative decoding algorithms for LDPC codes have very low computational overhead and the number of decoding iterations can be made to automatically adjust with the number of stragglers in the system. We show that for a random model for stragglers, the proposed moment encoding based gradient descent method can be viewed as the stochastic gradient descent method. This allows us to obtain convergence guarantees for the proposed solution. Furthermore, the proposed moment encoding based method is shown to outperform the existing schemes in a real distributed computing setup.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes