DC · LG · Apr 20, 2024

Breaking the Memory Wall for Heterogeneous Federated Learning via Progressive Training

arXiv:2404.13349v2 · 18 citations · h-index: 60 · KDD
Originality: Highly original
AI Analysis

The paper addresses memory limitations on heterogeneous devices in federated learning, offering a novel method that enables deployment on resource-constrained hardware.

This paper tackles memory constraints in federated learning by introducing ProFL, a progressive training framework that partitions models into blocks to reduce peak memory footprint, achieving up to 57.4% memory reduction and up to 82.4% accuracy improvement.

This paper presents ProFL, a new framework that addresses the memory constraints in federated learning (FL). Rather than updating the full model during local training, ProFL partitions the model into blocks based on its original architecture and trains the blocks progressively. It first trains the front blocks and freezes them once they converge, which triggers training of the next block. The portion of the model being trained thus grows step by step until the full model has been trained, which reduces the peak memory footprint enough to make deployment on heterogeneous devices feasible. To preserve the feature representation of each block, training is divided into two stages: model shrinking and model growing. In the model shrinking stage, an output module is designed for each block to help it learn the expected feature representation, and the initialization model parameters are obtained. These output modules and initialization parameters are then used in the corresponding model growing stage, which progressively trains the full model. In addition, a novel metric from the scalar perspective is proposed to assess the learning status of each block, so that a block can be safely frozen after convergence and training of the next one can begin. Finally, the convergence of ProFL is proved theoretically, and extensive experiments on representative models and datasets evaluate its effectiveness. The results show that ProFL reduces the peak memory footprint by up to 57.4% and improves model accuracy by up to 82.4%.
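
To make the memory argument concrete, here is a minimal Python sketch of the block-by-block schedule described in the abstract. It is not the authors' implementation. The cost model (a trained block holds weights, gradients, and stored activations, while a frozen block holds only weights), the `Block` class, the helper names, and the block sizes are all assumptions chosen for illustration. The per-block output modules, the model shrinking stage, and the convergence metric are left out.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    param_mb: float       # memory for the block's weights (hypothetical value)
    activation_mb: float  # activations stored for backprop (hypothetical value)


def training_memory_mb(blocks, trainable):
    """Toy cost model (an assumption, not taken from the paper).

    Every block present holds its weights. A block whose index is in
    `trainable` also holds gradients (same size as the weights) and the
    activations needed for backpropagation.
    """
    total = 0.0
    for i, block in enumerate(blocks):
        total += block.param_mb
        if i in trainable:
            total += block.param_mb + block.activation_mb
    return total


def progressive_schedule(blocks):
    """Yield (step, frozen_indices, active_index).

    One block is trained per step, all earlier blocks are frozen, and
    later blocks are not instantiated yet.
    """
    for step in range(len(blocks)):
        yield step, list(range(step)), step


def peak_memory_progressive(blocks):
    """Largest per-step memory over the whole progressive schedule."""
    peak = 0.0
    for step, _frozen, active in progressive_schedule(blocks):
        present = blocks[: step + 1]  # frozen blocks plus the active one
        peak = max(peak, training_memory_mb(present, {active}))
    return peak


# Made-up block sizes; these are not measurements from the paper.
blocks = [
    Block("block1", param_mb=10.0, activation_mb=30.0),
    Block("block2", param_mb=20.0, activation_mb=20.0),
    Block("block3", param_mb=30.0, activation_mb=10.0),
]

full = training_memory_mb(blocks, set(range(len(blocks))))  # train everything
peak = peak_memory_progressive(blocks)                      # one block at a time

print(f"full-model training: {full} MB")
print(f"progressive peak:    {peak} MB")
print(f"reduction:           {1 - peak / full:.1%}")
```

Under this toy model the progressive peak is lower because gradients and stored activations exist for only one block at a time. The actual savings reported for ProFL depend on the real architectures and on the output modules, which this sketch does not model.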

