A Communication Efficient Collaborative Learning Framework for Distributed Features
This work addresses communication bottlenecks in privacy-preserving collaborative learning among parties whose data is distributed by feature, though the contribution is incremental in that it builds on existing federated learning methods.
The paper tackles communication inefficiency in collaborative learning with distributed features by proposing a Federated Stochastic Block Coordinate Descent (FedBCD) algorithm which, within T iterations, needs only O(√T) communication rounds and achieves O(1/√T) accuracy, measured by the average of the gradient norm squared.
We introduce a collaborative learning framework that allows multiple parties holding different sets of attributes about the same users to jointly build models without exposing their raw data or model parameters. In particular, we propose a Federated Stochastic Block Coordinate Descent (FedBCD) algorithm, in which each party conducts multiple local updates before each communication to effectively reduce the number of communication rounds among parties, a principal bottleneck for collaborative learning problems. We theoretically analyze the impact of the number of local updates and show that when the batch size, sample size, and number of local iterations are selected appropriately, within $T$ iterations, the algorithm performs $\mathcal{O}(\sqrt{T})$ communication rounds and achieves $\mathcal{O}(1/\sqrt{T})$ accuracy (measured by the average of the gradient norm squared). The approach is supported by our empirical evaluations on a variety of tasks and datasets, demonstrating advantages over stochastic gradient descent (SGD) approaches.
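The idea of taking several local updates between communications can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes two parties, a linear model with squared loss, labels visible to both parties, and no encryption of the exchanged values. The function names `local_update_bcd` and `mse`, the hyperparameters, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def mse(X_a, X_b, w_a, w_b, y):
    """Mean squared error of the joint linear model on the full data."""
    return float(np.mean((X_a @ w_a + X_b @ w_b - y) ** 2))


def local_update_bcd(X_a, X_b, y, rounds, local_steps,
                     lr=0.05, batch_size=32, seed=0):
    """Two-party block coordinate descent with local updates (illustrative).

    Party A holds the feature block X_a, party B holds X_b. Assumption for
    this sketch: both parties can see the labels y, and the exchanged
    partial predictions are sent in the clear.
    """
    rng = np.random.default_rng(seed)
    n = X_a.shape[0]
    w_a = np.zeros(X_a.shape[1])
    w_b = np.zeros(X_b.shape[1])
    for _ in range(rounds):
        # Both parties agree on one mini-batch for this round.
        idx = rng.choice(n, size=batch_size, replace=False)
        Xa, Xb, yb = X_a[idx], X_b[idx], y[idx]
        # The only communication in the round: exchange partial predictions.
        z_a = Xa @ w_a
        z_b = Xb @ w_b
        # Party A updates its block several times using the stale z_b.
        for _ in range(local_steps):
            w_a = w_a - lr * Xa.T @ (Xa @ w_a + z_b - yb) / batch_size
        # Party B does the same with the stale z_a (conceptually in parallel).
        for _ in range(local_steps):
            w_b = w_b - lr * Xb.T @ (z_a + Xb @ w_b - yb) / batch_size
    return w_a, w_b


# Synthetic, noise-free data split by feature between the two parties.
rng = np.random.default_rng(1)
X_a = rng.normal(size=(200, 3))
X_b = rng.normal(size=(200, 2))
y = X_a @ np.array([1.0, -2.0, 0.5]) + X_b @ np.array([0.3, 1.5])

initial_loss = mse(X_a, X_b, np.zeros(3), np.zeros(2), y)
w_a, w_b = local_update_bcd(X_a, X_b, y, rounds=200, local_steps=5)
final_loss = mse(X_a, X_b, w_a, w_b, y)
print(initial_loss, final_loss)
```

In this sketch the total number of local iterations is `rounds * local_steps`, so running $Q$ local steps per round cuts communication by a factor of $Q$ for the same iteration budget. One reading consistent with the stated bound is that choosing $Q$ on the order of $\sqrt{T}$ gives roughly $\sqrt{T}$ rounds within $T$ iterations; the precise conditions on batch size and step size are those of the paper's analysis, not of this example.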