ML, DC, LG, SY, CO | Nov 17, 2015

Extending Gossip Algorithms to Distributed Estimation of U-Statistics

arXiv:1511.05464v1 | 16 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient algorithms to compute U-statistics, such as Area Under the Curve and empirical variance, in distributed systems, representing an incremental advancement over existing gossip methods.

The paper tackles the problem of distributed estimation of U-statistics in decentralized networks, proposing new synchronous and asynchronous gossip algorithms that achieve convergence rates of O(1/t) and O(log t/t), respectively, and outperform prior methods in numerical experiments.

Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems. Whereas distributed estimation of sample mean statistics has been the subject of a good deal of attention, computation of $U$-statistics, relying on more expensive averaging over pairs of observations, is a less investigated area. Yet, such data functionals are essential to describe global properties of a statistical population, with important examples including Area Under the Curve, empirical variance, Gini mean difference and within-cluster point scatter. This paper proposes new synchronous and asynchronous randomized gossip algorithms which simultaneously propagate data across the network and maintain local estimates of the $U$-statistic of interest. We establish convergence rate bounds of $O(1/t)$ and $O(\log t / t)$ for the synchronous and asynchronous cases respectively, where $t$ is the number of iterations, with explicit data and network dependent terms. Beyond favorable comparisons in terms of rate analysis, numerical experiments provide empirical evidence that the proposed algorithms surpass the previously introduced approach.
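
To make "averaging over pairs of observations" concrete, here is a minimal centralized sketch of a degree-two $U$-statistic in Python. It is not the paper's gossip algorithm. It only shows the quantity that such an algorithm would estimate when every observation sits on one machine. The function and kernel names (`u_statistic`, `variance_kernel`, `gini_kernel`) and the sample `data` are illustrative assumptions, and the normalization used (an average over unordered pairs of distinct observations) is the textbook convention, which may differ from the one adopted in the paper.

```python
from itertools import combinations


def u_statistic(sample, kernel):
    """Average a symmetric kernel over all unordered pairs of distinct observations.

    Assumption: textbook degree-two normalization, i.e. division by n * (n - 1) / 2.
    """
    pairs = list(combinations(sample, 2))
    if not pairs:
        raise ValueError("need at least two observations")
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def variance_kernel(x, y):
    # With this kernel the U-statistic equals the unbiased sample variance.
    return 0.5 * (x - y) ** 2


def gini_kernel(x, y):
    # With this kernel the U-statistic is the Gini mean difference.
    return abs(x - y)


# Hypothetical toy sample, chosen only for illustration.
data = [1.0, 2.0, 4.0, 7.0]
empirical_variance = u_statistic(data, variance_kernel)
gini_mean_difference = u_statistic(data, gini_kernel)
print(empirical_variance, gini_mean_difference)
```

The cost of this direct computation grows quadratically with the number of observations, since every pair is visited. That is the "more expensive averaging" the abstract contrasts with sample mean statistics, and in a decentralized network each node initially holds only its own observation, so the pairs cannot be formed locally.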

