LG DC OCApr 16, 2025

FedCanon: Non-Convex Composite Federated Learning with Efficient Proximal Operation on Heterogeneous Data

Yuan Zhou, Jiachen Zhong, Xinli Shi, Guanghui Wen, Xinghuo Yu

arXiv:2504.11903v111.44 citationsh-index: 18IEEE Transactions on Signal Processing

Originality Highly original

AI Analysis

This addresses efficiency and robustness issues in federated learning for distributed systems, particularly under data heterogeneity, representing a novel method for a known bottleneck.

The paper tackled the problem of composite federated learning with non-convex losses and non-smooth regularization on heterogeneous data, proposing FedCanon to reduce proximal computation costs and improve accuracy, achieving sublinear and linear convergence rates and outperforming state-of-the-art methods in experiments.

Composite federated learning offers a general framework for solving machine learning problems with additional regularization terms. However, many existing methods require clients to perform multiple proximal operations to handle non-smooth terms and their performance are often susceptible to data heterogeneity. To overcome these limitations, we propose a novel composite federated learning algorithm called \textbf{FedCanon}, designed to solve the optimization problems comprising a possibly non-convex loss function and a weakly convex, potentially non-smooth regularization term. By decoupling proximal mappings from local updates, FedCanon requires only a single proximal evaluation on the server per iteration, thereby reducing the overall proximal computation cost. It also introduces control variables that incorporate global gradient information into client updates, which helps mitigate the effects of data heterogeneity. Theoretical analysis demonstrates that FedCanon achieves sublinear convergence rates under general non-convex settings and linear convergence under the Polyak-Łojasiewicz condition, without relying on bounded heterogeneity assumptions. Experiments demonstrate that FedCanon outperforms the state-of-the-art methods in terms of both accuracy and computational efficiency, particularly under heterogeneous data distributions.

View on arXiv PDF

Similar