LG · May 27, 2022

FedAvg with Fine Tuning: Local Updates Lead to Representation Learning

Liam Collins, Hamed Hassani, Aryan Mokhtari, Sanjay Shakkottai

arXiv:2205.13692v125.9140 citations · h-index: 47

Originality: Incremental advance

AI Analysis

The paper provides theoretical insight into a widely used federated learning method, addressing a gap in understanding for researchers and practitioners. The advance is incremental, since it builds on existing empirical observations.

The paper tackles the problem of understanding why Federated Averaging (FedAvg) with fine-tuning generalizes well to new tasks, showing theoretically that local updates in FedAvg learn a common linear representation from diverse client data, with formal iteration complexity bounds.

The Federated Averaging (FedAvg) algorithm, which alternates between a few local stochastic gradient updates at client nodes and a model averaging update at the server, is perhaps the most commonly used method in Federated Learning. Notwithstanding its simplicity, several empirical studies have illustrated that the output model of FedAvg, after a few fine-tuning steps, generalizes well to new, unseen tasks. This surprising performance of such a simple method, however, is not fully understood from a theoretical point of view. In this paper, we formally investigate this phenomenon in the multi-task linear representation setting. We show that the reason behind the generalizability of FedAvg's output is its power in learning the common data representation among the clients' tasks, by leveraging the diversity among client data distributions via local updates. We formally establish the iteration complexity required by the clients to achieve this result in the setting where the underlying shared representation is a linear map. To the best of our knowledge, this is the first such result for any setting. We also provide empirical evidence demonstrating FedAvg's representation learning ability in federated image classification with heterogeneous data.
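The procedure the abstract describes (a few local stochastic gradient steps on each client, a server-side average, then a few fine-tuning steps on a new task) can be sketched roughly as follows. This is a minimal pure-Python illustration, assuming a plain linear regression model with squared loss rather than the multi-task linear representation setting the paper analyzes. All function names (`local_sgd`, `fedavg`, `fine_tune`, `make_client`), the synthetic data, and the hyperparameters are hypothetical choices made for this example and are not taken from the paper.

```python
import random


def predict(w, x):
    """Linear prediction <w, x>."""
    return sum(wi * xi for wi, xi in zip(w, x))


def mse(w, data):
    """Mean squared error of w on a list of (x, y) pairs."""
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)


def local_sgd(w, data, lr, steps, rng):
    """Run a few single-sample SGD steps on one client, starting from w."""
    w = list(w)
    for _ in range(steps):
        x, y = rng.choice(data)
        err = predict(w, x) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def fedavg(clients, dim, rounds, local_steps, lr, rng):
    """Alternate local updates on every client with server-side averaging."""
    w = [0.0] * dim
    for _ in range(rounds):
        local_models = [local_sgd(w, data, lr, local_steps, rng) for data in clients]
        # Server step: coordinate-wise average of the clients' local models.
        w = [sum(coords) / len(local_models) for coords in zip(*local_models)]
    return w


def fine_tune(w, data, lr, steps):
    """A few full-batch gradient steps on a new client's data."""
    w = list(w)
    n = len(data)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, y in data:
            err = predict(w, x) - y
            grad = [g + err * xi / n for g, xi in zip(grad, x)]
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w


def make_client(base, spread, n, rng):
    """Synthetic client (an assumption): its true weights are a perturbed copy of base."""
    w_true = [b + rng.uniform(-spread, spread) for b in base]
    xs = [[rng.uniform(-1, 1) for _ in base] for _ in range(n)]
    return [(x, predict(w_true, x)) for x in xs]


rng = random.Random(0)
base = [1.0, -2.0, 0.5]  # hypothetical shared structure across clients
clients = [make_client(base, 0.3, 50, rng) for _ in range(5)]
new_client = make_client(base, 0.3, 50, rng)  # unseen task

w_global = fedavg(clients, dim=3, rounds=200, local_steps=5, lr=0.05, rng=rng)
w_tuned = fine_tune(w_global, new_client, lr=0.1, steps=10)

print("MSE on new client before fine-tuning:", mse(w_global, new_client))
print("MSE on new client after fine-tuning: ", mse(w_tuned, new_client))
```

The sketch only mirrors the train-then-fine-tune workflow. It does not reproduce the paper's representation learning result, which concerns a shared linear map composed with client-specific heads.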
