LG · May 6, 2022

Online Model Compression for Federated Learning with Large Models

arXiv:2205.03494v1 · 12 citations · h-index: 32
Originality: Incremental advance
AI Analysis

This work addresses efficiency challenges for federated learning systems that deploy large models. It is an incremental advance because it builds on existing compression techniques.

The paper tackles the high memory and communication costs of training large neural networks in federated learning by proposing Online Model Compression (OMC), which uses quantization to reduce the memory usage and communication cost of model parameters by up to 59% while maintaining comparable accuracy and training speed.

This paper addresses the challenges of training large neural network models under federated learning settings: high on-device memory usage and communication cost. The proposed Online Model Compression (OMC) provides a framework that stores model parameters in a compressed format and decompresses them only when needed. We use quantization as the compression method in this paper and propose three methods to minimize the impact on model accuracy: (1) per-variable transformation, (2) weight-matrices-only quantization, and (3) partial parameter quantization. According to our experiments on two recent neural networks for speech recognition and two different datasets, OMC can reduce memory usage and communication cost of model parameters by up to 59% while attaining comparable accuracy and training speed when compared with full-precision training.
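The idea of storing parameters in compressed form and decompressing them only when needed can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. It assumes 8-bit affine (min/max) integer quantization as a stand-in for the paper's quantization format, treats "weight matrix" as any 2-D array, and models partial parameter quantization as quantizing a random fraction of the eligible variables. The names `quantize`, `dequantize`, `compress_model`, `fetch`, and `stored_bytes` are hypothetical.

```python
import numpy as np

LEVELS = 255  # assumption: 8-bit unsigned integer codes


def quantize(x):
    """Per-variable affine quantization: map this variable's [min, max] to 0..255."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) / LEVELS if hi > lo else 1.0
    codes = np.round((x - lo) / scale).astype(np.uint8)
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover an approximate float32 tensor from its codes."""
    return codes.astype(np.float32) * scale + lo


def compress_model(params, fraction=0.9, seed=0):
    """Quantize a random fraction of the 2-D variables; keep the rest in float32."""
    rng = np.random.default_rng(seed)
    stored = {}
    for name, value in params.items():
        is_matrix = value.ndim == 2  # assumption: only 2-D arrays count as weight matrices
        if is_matrix and rng.random() < fraction:
            stored[name] = ("q", quantize(value))
        else:
            stored[name] = ("fp", value.astype(np.float32))
    return stored


def fetch(stored, name):
    """Decompress a single variable on demand, right before it is used."""
    kind, payload = stored[name]
    return dequantize(*payload) if kind == "q" else payload


def stored_bytes(stored):
    """Bytes held in storage; 8 extra bytes per quantized variable for scale and offset."""
    total = 0
    for kind, payload in stored.values():
        total += payload[0].nbytes + 8 if kind == "q" else payload.nbytes
    return total
```

In this sketch a training step would call `fetch` for one variable at a time, so only that variable exists in full precision at any moment, while biases and other non-matrix variables stay in float32 throughout. The same compressed representation could be what is sent between server and clients, which is where the communication saving would come from.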

