IT LG SP ML
Feb 11, 2023

Communication and Storage Efficient Federated Split Learning

arXiv:2302.05599v1
10.325 citations
h-index: 15

Originality: Incremental advance

AI Analysis

This paper addresses efficiency issues in distributed machine learning for edge computing, though the advance is incremental because it builds on existing FSL methods.

The paper tackles the high communication overhead and server storage requirements of Federated Split Learning (FSL) by proposing CSE-FSL. The method uses an auxiliary network to update the client models locally, which removes the need for the server to send gradients back, and it keeps only a single model at the server. The authors report a significant communication reduction and state-of-the-art accuracy on real-world tasks.
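To make the two claimed savings concrete, the following back-of-the-envelope sketch counts the floats exchanged per global round and the floats stored at the server. Every number and name in it (`Setup`, `upload_period`, the tensor sizes) is an illustrative assumption rather than a figure from the paper, and it ignores labels and model-aggregation traffic. It only encodes the structure described above: baseline FSL uploads smashed data and downloads a same-shaped gradient on every batch, while a CSE-FSL-style scheme uploads on selected batches only and downloads no gradients.

```python
from dataclasses import dataclass


@dataclass
class Setup:
    # All values are illustrative assumptions, not figures from the paper.
    num_clients: int = 10
    batches_per_round: int = 100       # local mini-batches per client per global round
    smashed_floats: int = 4096         # floats in one batch of smashed data
    server_model_floats: int = 1_000_000
    upload_period: int = 5             # hypothetical: upload every 5th batch


def fsl_floats_per_round(s: Setup) -> int:
    """Baseline FSL: each batch uploads smashed data and downloads its gradient."""
    return s.num_clients * s.batches_per_round * 2 * s.smashed_floats


def cse_fsl_floats_per_round(s: Setup) -> int:
    """CSE-FSL-style: upload on selected batches only, no gradient download."""
    uploads = s.batches_per_round // s.upload_period
    return s.num_clients * uploads * s.smashed_floats


def server_storage_floats(s: Setup, one_model_per_client: bool) -> int:
    """Server-side storage: one model copy per client, or a single shared model."""
    copies = s.num_clients if one_model_per_client else 1
    return copies * s.server_model_floats


if __name__ == "__main__":
    s = Setup()
    print("FSL floats/round:    ", fsl_floats_per_round(s))
    print("CSE-FSL floats/round:", cse_fsl_floats_per_round(s))
    print("FSL server storage:    ", server_storage_floats(s, True))
    print("CSE-FSL server storage:", server_storage_floats(s, False))
```

Under these assumed numbers, removing the gradient download halves the traffic and the upload period divides the rest by five. The server storage stops growing with the number of clients because only one model copy is kept.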

Federated learning (FL) is a popular distributed machine learning (ML) paradigm, but is often limited by significant communication costs and edge device computation capabilities. Federated Split Learning (FSL) preserves the parallel model training principle of FL, with a reduced device computation requirement thanks to splitting the ML model between the server and clients. However, FSL still incurs very high communication overhead due to transmitting the smashed data and gradients between the clients and the server in each global round. Furthermore, the server has to maintain separate models for every client, resulting in a significant computation and storage requirement that grows linearly with the number of clients. This paper tries to solve these two issues by proposing a communication and storage efficient federated and split learning (CSE-FSL) strategy, which utilizes an auxiliary network to locally update the client models while keeping only a single model at the server, hence avoiding the communication of gradients from the server and greatly reducing the server resource requirement. Communication cost is further reduced by only sending the smashed data in selected epochs from the clients. We provide a rigorous theoretical analysis of CSE-FSL that guarantees its convergence for non-convex loss functions. Extensive experimental results demonstrate that CSE-FSL has a significant communication reduction over existing FSL techniques while achieving state-of-the-art convergence and model accuracy, using several real-world FL tasks.
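The control flow described in the abstract can be sketched with a toy example. This is a minimal illustration under strong assumptions, not the authors' algorithm: every model is a single scalar weight, the task is the regression y = 2x, client weights are averaged FedAvg-style after each round, and smashed data is uploaded every `upload_period` local steps. The function `train` and all of its parameters are hypothetical names. What the sketch does preserve is the structure: each client updates itself through a local auxiliary head, the server trains one shared model on the uploaded smashed data, and no gradient ever travels from the server to a client.

```python
import random


def train(num_clients=3, rounds=20, steps=10, upload_period=5, lr=0.1, seed=0):
    rng = random.Random(seed)
    w_client = [0.5] * num_clients   # client-side models (scalars in this toy)
    w_aux = [0.5] * num_clients      # auxiliary heads, kept on the clients
    w_server = 0.5                   # the single shared server-side model
    uploads = 0
    gradient_downloads = 0           # stays 0: the server never sends gradients

    for _ in range(rounds):
        for c in range(num_clients):
            for step in range(1, steps + 1):
                x = rng.uniform(0.1, 1.0)
                y = 2.0 * x                    # toy regression target
                z = w_client[c] * x            # smashed data at the cut layer

                # Local update: squared loss computed through the auxiliary head.
                err = w_aux[c] * z - y
                grad_aux = 2.0 * err * z
                grad_client = 2.0 * err * w_aux[c] * x
                w_aux[c] -= lr * grad_aux
                w_client[c] -= lr * grad_client

                # Upload smashed data (and label) only on selected steps.
                if step % upload_period == 0:
                    uploads += 1
                    server_err = w_server * z - y
                    w_server -= lr * 2.0 * server_err * z

        # Assumed FedAvg-style aggregation of client-side weights.
        mean_client = sum(w_client) / num_clients
        mean_aux = sum(w_aux) / num_clients
        w_client = [mean_client] * num_clients
        w_aux = [mean_aux] * num_clients

    return {
        "uploads": uploads,
        "gradient_downloads": gradient_downloads,
        "client_product": w_client[0] * w_aux[0],     # should approach 2
        "server_product": w_server * w_client[0],     # should approach 2
    }


if __name__ == "__main__":
    print(train())
```

With the default arguments, each client uploads twice per round instead of ten times, and both the local path (client plus auxiliary head) and the split path (client plus server) learn the target slope.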
