ML, DC, LG · Jul 22, 2023

Collaboratively Learning Linear Models with Structured Missing Data

arXiv:2307.11947v1 · 6 citations · h-index: 66
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of coordinating agents that each observe an incomplete set of features, as in sensor networks, though its improvement over existing distributed learning methods is incremental.

The paper tackles the problem of collaboratively learning least squares estimates across multiple agents with structured missing data. It proposes a distributed algorithm, Collab, that is communication efficient because agents never share labeled data, and that is nearly asymptotically local minimax optimal.

We study the problem of collaboratively learning least squares estimates for $m$ agents. Each agent observes a different subset of the features, e.g., containing data collected from sensors of varying resolution. Our goal is to determine how to coordinate the agents in order to produce the best estimator for each agent. We propose a distributed, semi-supervised algorithm Collab, consisting of three steps: local training, aggregation, and distribution. Our procedure does not require communicating the labeled data, making it communication efficient and useful in settings where the labeled data is inaccessible. Despite this handicap, our procedure is nearly asymptotically local minimax optimal, even among estimators allowed to communicate the labeled data such as imputation methods. We test our method on real and synthetic data.
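The three steps named in the abstract (local training, aggregation, distribution) can be illustrated with a minimal NumPy sketch. This is not the paper's exact estimator. It assumes zero-mean features with a known covariance `Sigma` (in a semi-supervised setting this might instead be estimated from unlabeled data), and it uses a simple hypothetical weighting `n * Sigma[obs, obs]` for the aggregation. The function names `local_fit`, `projection`, `aggregate`, and `distribute` are illustrative and do not come from the source.

```python
import numpy as np

def local_fit(X_obs, y):
    """Step 1 (local training): ordinary least squares on the agent's observed features."""
    theta, *_ = np.linalg.lstsq(X_obs, y, rcond=None)
    return theta

def projection(Sigma, obs):
    """Matrix T such that T @ theta_full is the population least squares
    coefficient when only the features in `obs` are observed (assumes zero-mean x)."""
    return np.linalg.solve(Sigma[np.ix_(obs, obs)], Sigma[obs, :])

def aggregate(thetas, Ts, weights):
    """Step 2 (aggregation): weighted least squares for the full parameter,
    minimizing sum_i (T_i theta - theta_i)^T W_i (T_i theta - theta_i)."""
    A = sum(T.T @ W @ T for T, W in zip(Ts, weights))
    b = sum(T.T @ W @ th for T, W, th in zip(Ts, weights, thetas))
    return np.linalg.solve(A, b)

def distribute(theta_full, Ts):
    """Step 3 (distribution): map the global estimate back to each agent's features."""
    return [T @ theta_full for T in Ts]

# Synthetic demo: 3 features, 3 agents, each missing one feature.
rng = np.random.default_rng(0)
d, n = 3, 5000
theta_true = np.array([1.0, -2.0, 0.5])
B = rng.normal(size=(d, d))
Sigma = B @ B.T + np.eye(d)          # assumed-known feature covariance
observed = [[0, 1], [1, 2], [0, 2]]  # feature indices seen by each agent

Ts = [projection(Sigma, obs) for obs in observed]
thetas, weights = [], []
for obs in observed:
    X = rng.multivariate_normal(np.zeros(d), Sigma, size=n)
    y = X @ theta_true + 0.1 * rng.normal(size=n)
    thetas.append(local_fit(X[:, obs], y))       # only the local estimate is shared
    weights.append(n * Sigma[np.ix_(obs, obs)])  # hypothetical weighting choice

theta_hat = aggregate(thetas, Ts, weights)
per_agent = distribute(theta_hat, Ts)
```

In this sketch only the local coefficient vectors leave each agent; the labeled pairs `(X, y)` stay local, which mirrors the communication pattern the abstract describes.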

