LGDec 31, 2020

Federated Nonconvex Sparse Learning

Qianqian Tong, Guannan Liang, Tan Zhu, Jinbo Bi

arXiv:2101.00052v19.914 citations

Originality Incremental advance

AI Analysis

This work provides theoretical guarantees and practical algorithms for federated nonconvex sparse learning, which is critical for privacy-preserving and efficient model training on distributed, non-IID data for practitioners in signal processing and deep network compression.

This paper addresses nonconvex sparse learning in federated settings, proposing two iterative hard thresholding (IHT) methods, Fed-HT and FedIter-HT. These methods achieve linear convergence rates and recover optimal sparse estimators with decentralized non-IID data, outperforming a distributed IHT competitor in objective value decrease with fewer communication rounds and less bandwidth.

Nonconvex sparse learning plays an essential role in many areas, such as signal processing and deep network compression. Iterative hard thresholding (IHT) methods are the state-of-the-art for nonconvex sparse learning due to their capability of recovering true support and scalability with large datasets. Theoretical analysis of IHT is currently based on centralized IID data. In realistic large-scale situations, however, data are distributed, hardly IID, and private to local edge computing devices. It is thus necessary to examine the property of IHT in federated settings, which update in parallel on local devices and communicate with a central server only once in a while without sharing local data. In this paper, we propose two IHT methods: Federated Hard Thresholding (Fed-HT) and Federated Iterative Hard Thresholding (FedIter-HT). We prove that both algorithms enjoy a linear convergence rate and have strong guarantees to recover the optimal sparse estimator, similar to traditional IHT methods, but now with decentralized non-IID data. Empirical results demonstrate that the Fed-HT and FedIter-HT outperform their competitor - a distributed IHT, in terms of decreasing the objective values with lower requirements on communication rounds and bandwidth.

View on arXiv PDF

Similar