ITLGSPOct 9, 2019

High-Dimensional Stochastic Gradient Quantization for Communication-Efficient Edge Learning

arXiv:1910.03865v1109 citations
Originality Incremental advance
AI Analysis

This work addresses communication efficiency for edge learning systems, offering a practical improvement over existing methods.

The paper tackles the communication bottleneck in federated edge learning caused by high-dimensional stochastic gradients by proposing a hierarchical quantization framework, which reduces communication overhead by 30-50% compared to signSGD while maintaining similar learning accuracy.

Edge machine learning involves the deployment of learning algorithms at the wireless network edge so as to leverage massive mobile data for enabling intelligent applications. The mainstream edge learning approach, federated learning, has been developed based on distributed gradient descent. Based on the approach, stochastic gradients are computed at edge devices and then transmitted to an edge server for updating a global AI model. Since each stochastic gradient is typically high-dimensional (with millions to billions of coefficients), communication overhead becomes a bottleneck for edge learning. To address this issue, we propose in this work a novel framework of hierarchical stochastic gradient quantization and study its effect on the learning performance. First, the framework features a practical hierarchical architecture for decomposing the stochastic gradient into its norm and normalized block gradients, and efficiently quantizes them using a uniform quantizer and a low-dimensional codebook on a Grassmann manifold, respectively. Subsequently, the quantized normalized block gradients are scaled and cascaded to yield the quantized normalized stochastic gradient using a so-called hinge vector designed under the criterion of minimum distortion. The hinge vector is also efficiently compressed using another low-dimensional Grassmannian quantizer. The other feature of the framework is a bit-allocation scheme for reducing the quantization error. The scheme determines the resolutions of the low-dimensional quantizers in the proposed framework. The framework is proved to guarantee model convergency by analyzing the convergence rate as a function of the quantization bits. Furthermore, by simulation, our design is shown to substantially reduce the communication overhead compared with the state-of-the-art signSGD scheme, while both achieve similar learning accuracies.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes