LG · DC · Jul 2, 2024

Towards Federated Learning with On-device Training and Communication in 8-bit Floating Point

arXiv:2407.02610v2 · 1 citation · h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses communication bottlenecks in federated learning for edge devices, though it is incremental as it builds on existing FP8 training methods.

The paper reduces communication costs in federated learning by using 8-bit floating point (FP8) for both on-device training and client-server communication. It reports at least a 2.9x reduction in communication relative to an FP32 baseline at the same model accuracy.

Recent work has shown that 8-bit floating point (FP8) can be used for efficiently training neural networks with reduced computational cost compared to training in FP32/FP16. In this work, we investigate the use of FP8 training in a federated learning context. This approach brings not only the usual benefits of FP8, which are desirable for on-device training at the edge, but also reduces client-server communication costs due to significant weight compression. We present a novel method for combining FP8 client training with a global FP32 server model and provide convergence analysis. Experiments across a variety of tasks, models, and datasets show that our method consistently reduces communication by at least 2.9x compared to an FP32 baseline while reaching the same trained model accuracy.
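To make the compression idea concrete, here is a minimal sketch of clients sending FP8-rounded weights to a server that aggregates in FP32. It is not the paper's method: the per-tensor scaling, the choice of the E4M3 format, the plain averaging, and all function names (`fake_quant_e4m3`, `client_encode`, `server_aggregate`, `payload_ratio`) are assumptions made for illustration, and the FP8 values are only simulated in FP32 with NumPy.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite value in the E4M3 format


def fake_quant_e4m3(x):
    """Round values to the nearest E4M3-representable value (simulated in FP32)."""
    x = np.asarray(x, dtype=np.float32)
    _, exp = np.frexp(x)                      # x = m * 2**exp with 0.5 <= |m| < 1
    e = np.maximum(exp - 1, -6)               # floor(log2|x|), clamped to the minimum normal exponent
    step = np.ldexp(np.float32(1.0), e - 3)   # spacing given 3 mantissa bits
    q = np.round(x / step) * step             # round to nearest, ties to even
    return np.clip(q, -FP8_E4M3_MAX, FP8_E4M3_MAX).astype(np.float32)


def client_encode(weights):
    """Hypothetical client step: scale the tensor into FP8 range, then round."""
    weights = np.asarray(weights, dtype=np.float32)
    scale = float(np.max(np.abs(weights))) / FP8_E4M3_MAX
    if scale == 0.0:
        scale = 1.0                           # all-zero tensor, any scale works
    return fake_quant_e4m3(weights / scale), np.float32(scale)


def server_aggregate(payloads):
    """Hypothetical server step: dequantize each payload and average in FP32."""
    return np.mean([q * s for q, s in payloads], axis=0, dtype=np.float32)


def payload_ratio(n_weights):
    """FP32 bytes divided by FP8 bytes plus one FP32 scale per tensor."""
    return (4 * n_weights) / (n_weights + 4)
```

Under these assumptions the per-round payload shrinks by a factor that approaches 4x for large tensors. The 2.9x figure in the abstract is measured at equal trained accuracy, so it need not match this per-round ratio.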

