LG | Dec 27, 2021

AET-SGD: Asynchronous Event-triggered Stochastic Gradient Descent

arXiv:2112.13935v1 | 2 citations
Originality: Incremental advance
AI Analysis

This paper addresses communication inefficiency in distributed machine learning, offering a new method for reducing communication cost and handling delays, though it is an incremental advance over existing event-triggered approaches.

The paper tackles the communication bottleneck in distributed learning by proposing AET-SGD, an asynchronous event-triggered SGD framework that reduces communication cost by 44x to 120x compared to state-of-the-art methods while maintaining good convergence performance and mitigating delay impacts.

Communication cost is the main bottleneck in the design of effective distributed learning algorithms. Recently, event-triggered techniques have been proposed to reduce the information exchanged among compute nodes and thus alleviate the communication cost. However, most existing event-triggered approaches consider only heuristic event-triggered thresholds. They also ignore the impact of computation and network delay, which plays an important role in training performance. In this paper, we propose an Asynchronous Event-triggered Stochastic Gradient Descent (SGD) framework, called AET-SGD, to i) reduce the communication cost among the compute nodes, and ii) mitigate the impact of the delay. Compared with baseline event-triggered methods, AET-SGD employs an event-triggered threshold based on a linearly increasing sample size, and can significantly reduce the communication cost while keeping good convergence performance. We implement AET-SGD and evaluate its performance on multiple representative data sets, including MNIST, FashionMNIST, KMNIST and CIFAR10. The experimental results validate the correctness of the design and show a significant communication cost reduction, from 44x to 120x, compared to the state of the art. Our results also show that AET-SGD can tolerate large delays from straggler nodes while achieving decent performance and the desired speedup ratio.
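To make the idea of a linearly increasing sample-size threshold concrete, here is a minimal single-worker sketch. It is not the authors' implementation: the abstract does not give the exact schedule, so the form s_i = a*i + b, the names `round_size` and `run_worker`, the toy 1-D regression problem, and all parameter values are assumptions chosen for illustration. Asynchrony across workers and network delay are not modeled; the sketch only counts how often a worker would communicate.

```python
import random


def round_size(i, a, b):
    """Hypothetical threshold: local SGD steps required in round i before sending."""
    return a * i + b


def run_worker(total_iters, a=10, b=10, lr=0.05, seed=0):
    """Run local SGD on a toy 1-D problem; return (final weight, messages sent)."""
    rng = random.Random(seed)
    w = 0.0             # local model; the toy target is w* = 3.0
    sent = 0            # number of communication events triggered
    round_idx = 0       # index i of the current communication round
    steps_in_round = 0  # local SGD steps taken since the last send
    for _ in range(total_iters):
        x = rng.uniform(-1.0, 1.0)
        y = 3.0 * x                      # noise-free label for the toy problem
        grad = 2.0 * (w * x - y) * x     # gradient of (w*x - y)^2 w.r.t. w
        w -= lr * grad
        steps_in_round += 1
        # Event trigger: communicate only once enough samples were processed.
        if steps_in_round >= round_size(round_idx, a, b):
            sent += 1                    # a real worker would push its update here
            round_idx += 1
            steps_in_round = 0
    return w, sent


if __name__ == "__main__":
    _, linear_msgs = run_worker(10_000, a=10, b=10)    # growing threshold
    _, constant_msgs = run_worker(10_000, a=0, b=10)   # fixed threshold
    print(linear_msgs, constant_msgs)
```

Under these assumed values, the growing threshold triggers 44 sends over 10,000 local steps, while a fixed threshold of 10 steps triggers 1,000. The ratio depends entirely on the hypothetical choices of a, b, and the step budget, and should not be read as reproducing the paper's reported 44x to 120x reduction.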

Code Implementations: 1 repo
