LG, CG, DC, DS | Oct 26, 2022

Coresets for Vertical Federated Learning: Regularized Linear Regression and $K$-Means Clustering

arXiv:2210.14664v1 | 25 citations | h-index: 20
Originality: Incremental advance
AI Analysis

This paper addresses communication bottlenecks in federated learning when data features are distributed across parties. The advance is incremental because it applies existing coreset ideas to VFL.

The paper tackles the high communication complexity in vertical federated learning by proposing a coreset framework, showing it drastically reduces communication while maintaining solution quality for regularized linear regression and k-means clustering.

Vertical federated learning (VFL), where data features are distributed across multiple parties, is an important area in machine learning. However, the communication complexity of VFL is typically very high. In this paper, we propose a unified framework that constructs coresets in a distributed fashion for communication-efficient VFL. We study two important learning tasks in the VFL setting, regularized linear regression and $k$-means clustering, and apply our coreset framework to both problems. We theoretically show that using coresets can drastically reduce the communication complexity while nearly maintaining the solution quality. Numerical experiments are conducted to corroborate our theoretical findings.
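
To make the coreset idea concrete, here is a minimal Python sketch of a weighted coreset for $k$-means over vertically partitioned features. It is not the paper's protocol: it swaps in a generic importance-sampling scheme (in the style of lightweight coresets) that mixes uniform sampling with sampling proportional to squared distance from the data mean. The function names `local_scores`, `sample_coreset`, and `kmeans_cost` are hypothetical. This naive version also sends one scalar per row from each party, which a communication-efficient protocol would avoid.

```python
import numpy as np

def local_scores(block):
    """Party side: squared distance of each row to this party's local feature mean."""
    centered = block - block.mean(axis=0)
    return np.sum(centered ** 2, axis=1)

def sample_coreset(score_lists, m, rng):
    """Server side: combine per-party scores and importance-sample m weighted rows."""
    # The squared distance to the global mean is the sum of the per-block distances,
    # because the feature blocks are disjoint coordinates of the same rows.
    total = np.sum(score_lists, axis=0)
    n = total.shape[0]
    if total.sum() == 0:
        q = np.full(n, 1.0 / n)                  # all rows identical: fall back to uniform
    else:
        q = 0.5 / n + 0.5 * total / total.sum()  # half uniform, half distance-proportional
    idx = rng.choice(n, size=m, replace=True, p=q)
    weights = 1.0 / (m * q[idx])                 # inverse-probability weights
    return idx, weights

def kmeans_cost(points, centers, weights=None):
    """Sum of (optionally weighted) squared distances to the nearest center."""
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2).min(axis=1)
    return float(d2.sum() if weights is None else (weights * d2).sum())
```

With the sampled indices `idx` shared among parties, each party would only need to communicate its rows at those indices, and `kmeans_cost` evaluated on the weighted sample gives an unbiased estimate of the cost on the full data for any fixed set of centers.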

Code Implementations: 1 repo
