Scalable Decentralized Algorithms for Online Personalized Mean Estimation
This work addresses scalability in collaborative learning for agents with differing data distributions, though it is incremental as it simplifies the broader problem to mean estimation.
The paper tackles the problem of scalable decentralized online personalized mean estimation for agents with insufficient local data, proposing two algorithms with complexities O(r|A|log|A|) and O(r|A|) that achieve asymptotically optimal estimates under certain conditions.
In numerous settings, agents lack sufficient data to directly learn a model. Collaborating with other agents may help, but it introduces a bias-variance trade-off, when local data distributions differ. A key challenge is for each agent to identify clients with similar distributions while learning the model, a problem that remains largely unresolved. This study focuses on a simplified version of the overarching problem, where each agent collects samples from a real-valued distribution over time to estimate its mean. Existing algorithms face impractical space and time complexities (quadratic in the number of agents A). To address scalability challenges, we propose a framework where agents self-organize into a graph, allowing each agent to communicate with only a selected number of peers r. We introduce two collaborative mean estimation algorithms: one draws inspiration from belief propagation, while the other employs a consensus-based approach, with complexity of O( r |A| log |A|) and O(r |A|), respectively. We establish conditions under which both algorithms yield asymptotically optimal estimates and offer a theoretical characterization of their performance.