Modulated learning for private and distributed regression with just a single sample per client device

Praneeth Vepakomma, Amirhossein Reisizadeh, Samuel Horváth, Munther Dahleh

arXiv:2605.0723331.2

AI Analysis

For federated learning practitioners, this work addresses the one-sample-per-client problem, enabling privacy-preserving collaborative learning from devices with extremely limited data.

This work addresses the challenge of learning from a large number of devices where each device holds only a single data sample, a scenario in which standard federated learning breaks down. The proposed method injects a single calibrated noisy perturbation per client and post-processes the result before sharing it with the server. The server aggregates these representations into a gradient estimate that is unbiased, matching the non-private centralized gradient in expectation, while preserving privacy. Accurate models can thus be learned without large local datasets.

This work focuses on the question of learning from a large number of devices, each holding only a single sample of data. Several real-world applications fit this one-sample-per-client setup, including learning from fitness trackers, data/app usage aggregators, body-worn sensing devices, and daily event monitors, to name a few. When a client has only one sample, the standard federated learning paradigm breaks down because a local update based on that single point is of little use, especially in the earlier rounds of estimating the model coefficients. This utility is further weakened by the privacy-inducing noise applied at every round. This work addresses the problem so that such clients can collaboratively learn a global model without leaking their private data. The proposed approach injects a single, carefully calibrated noisy perturbation to transform the sample at each client, followed by post-processing into a representation that is shared with the server. The server aggregates these representations and processes them to obtain an unbiased gradient update that, in expectation, matches the non-private centralized gradient while preserving data privacy. This differs from traditional private federated learning, whose communication payloads are model coefficients rather than privately transformed data samples. The method enables devices with extremely limited data to collaborate and learn accurate, privacy-preserving models without requiring large local datasets or sacrificing individual privacy.
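
The abstract does not spell out the transformation, so the following is only an illustrative sketch of how a one-time perturbation can still give an unbiased gradient. It assumes linear regression with squared loss and independent zero-mean Gaussian noise of standard deviation `sigma` added once to each client's features and label. The function names (`client_message`, `server_gradient`, `centralized_gradient`) and the specific debiasing step are hypothetical choices, not the authors' method, and `sigma` here is not calibrated to any formal privacy guarantee.

```python
import numpy as np


def client_message(x, y, sigma, rng):
    """Perturb a single (x, y) sample once, then post-process it.

    Assumption (not from the paper): Gaussian noise plus a second-moment
    correction, so that the shared statistics are unbiased.
    """
    x_noisy = x + rng.normal(0.0, sigma, size=x.shape)
    y_noisy = y + rng.normal(0.0, sigma)
    # E[x_noisy x_noisy^T] = x x^T + sigma^2 I, so subtract the known bias.
    A = np.outer(x_noisy, x_noisy) - sigma**2 * np.eye(x.size)
    # The two noise draws are independent and zero-mean, so E[b] = x * y.
    b = x_noisy * y_noisy
    return A, b


def server_gradient(messages, w):
    """Average the client statistics and form the gradient estimate at w."""
    A_mean = np.mean([A for A, _ in messages], axis=0)
    b_mean = np.mean([b for _, b in messages], axis=0)
    return A_mean @ w - b_mean


def centralized_gradient(X, y, w):
    """Non-private gradient of (1 / 2n) * ||X w - y||^2 for comparison."""
    return X.T @ (X @ w - y) / len(y)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, sigma = 5000, 3, 0.5  # one sample per client, n clients
    X = rng.normal(size=(n, d))
    y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n)
    w = np.zeros(d)
    msgs = [client_message(X[i], y[i], sigma, rng) for i in range(n)]
    print("private estimate:", server_gradient(msgs, w))
    print("centralized     :", centralized_gradient(X, y, w))
```

Because each client perturbs its sample only once, the same messages can be reused at every round for any `w` in this sketch. The estimate is unbiased but noisy, with error shrinking as the number of clients grows.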
