ME · DC · LG · ML · Jan 17, 2020

Communication-Efficient Distributed Estimator for Generalized Linear Models with a Diverging Number of Covariates

arXiv:2001.06194v2 · 7 citations
Originality: Incremental advance
AI Analysis

This work addresses communication efficiency in distributed data analysis for statistical modeling, offering a practical solution with relaxed server assumptions, though it is incremental in nature.

The paper tackles the problem of distributed statistical inference for generalized linear models with a diverging number of covariates, proposing a novel two-round communication method that achieves asymptotic efficiency and demonstrates satisfactory performance in simulations and a case study.

Distributed statistical inference has recently attracted immense attention. The asymptotic efficiency of the maximum likelihood estimator (MLE), the one-step MLE, and the aggregated estimating equation estimator is established for generalized linear models under the "large $n$, diverging $p_n$" framework, where the dimension of the covariates $p_n$ grows to infinity at a polynomial rate $o(n^\alpha)$ for some $0<\alpha<1$. A novel method is then proposed to obtain an asymptotically efficient estimator for large-scale distributed data using two rounds of communication. In this method, the assumption on the number of servers is more relaxed and thus more practical for real-world applications. Simulations and a case study demonstrate the satisfactory finite-sample performance of the proposed estimators.
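To make the idea of a two-round, one-step estimator concrete, here is a minimal sketch for logistic regression. It is an illustration under stated assumptions, not the paper's algorithm: it assumes the first round averages local MLEs and the second round aggregates local scores and information matrices for a single Newton step. The function names (`local_mle`, `local_score_and_information`, `two_round_estimator`), the simulated data, and the choice of 10 servers are all hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_mle(X, y, n_iter=25):
    """Newton-Raphson logistic regression fit on one server's data."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = sigmoid(X @ beta)
        grad = X.T @ (y - mu)                        # score vector
        hess = (X * (mu * (1 - mu))[:, None]).T @ X  # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def local_score_and_information(X, y, beta):
    """Summary statistics a server would send in the second round."""
    mu = sigmoid(X @ beta)
    score = X.T @ (y - mu)
    info = (X * (mu * (1 - mu))[:, None]).T @ X
    return score, info

def two_round_estimator(shards):
    # Round 1 (assumed): each server sends its local MLE; the center averages them.
    beta0 = np.mean([local_mle(X, y) for X, y in shards], axis=0)
    # Round 2 (assumed): each server sends its score and information at beta0;
    # the center sums them and takes one Newton step.
    stats = [local_score_and_information(X, y, beta0) for X, y in shards]
    score = sum(s for s, _ in stats)
    info = sum(i for _, i in stats)
    return beta0 + np.linalg.solve(info, score)

# Simulated data (hypothetical sizes): n observations, p covariates, K servers.
rng = np.random.default_rng(0)
n, p, K = 20000, 5, 10
beta_true = np.linspace(-1.0, 1.0, p)
X = rng.normal(size=(n, p))
y = rng.binomial(1, sigmoid(X @ beta_true)).astype(float)
shards = list(zip(np.array_split(X, K), np.array_split(y, K)))

beta_hat = two_round_estimator(shards)  # distributed, two rounds
beta_full = local_mle(X, y)             # pooled-data MLE for comparison
```

In this sketch each server transmits only a length-$p$ vector in the first round and a vector plus a $p \times p$ matrix in the second, so the raw data never leave the servers. Comparing `beta_hat` with `beta_full` gives a rough sense of how close the one-step update comes to the pooled MLE in this toy setting.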
