LG · AI · Oct 11, 2024

DistDD: Distributed Data Distillation Aggregation through Gradient Matching

arXiv:2410.08665v1 · 1 citation · h-index: 7
Originality: Incremental advance
AI Analysis

This paper addresses communication inefficiencies faced by developers and users of federated learning systems, though the advance appears incremental because it builds on existing federated learning and data distillation concepts.

The paper tackles the problem of high communication costs in federated learning by introducing DistDD, a method that distills data on clients' devices to create a global distilled dataset, reducing communication needs while maintaining privacy. Experiments show it handles non-i.i.d. and mislabeled data effectively, with demonstrated communication savings in neural architecture search applications.

In this paper, we introduce DistDD, a novel approach within the federated learning (FL) framework that reduces the need for repetitive communication by distilling data directly on clients' devices. Unlike traditional federated learning, which requires iterative model updates across nodes, DistDD performs a one-time distillation process that extracts a global distilled dataset, maintaining the privacy standards of federated learning while significantly cutting communication costs. By leveraging DistDD's distilled dataset, FL developers can perform just-in-time parameter tuning and neural architecture search (NAS) without repeating the whole FL process multiple times. We provide a detailed convergence proof of the DistDD algorithm, supporting its mathematical stability and reliability for practical applications. Our experiments demonstrate the effectiveness and robustness of DistDD, particularly in non-i.i.d. and mislabeled data scenarios, showing that it can handle complex real-world data challenges in ways that conventional federated learning methods do not. We also evaluate DistDD in a NAS use case and demonstrate its effectiveness and communication savings there.
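The abstract does not spell out the algorithm, so the following is only a toy sketch of the general idea of gradient matching with aggregated client gradients, not the authors' DistDD procedure. It assumes a linear model with mean squared error, size-weighted averaging of client gradients, and a fresh random model at every step. All function names (`client_gradient`, `aggregate`, `matching_loss_and_grads`, `distill`) and hyperparameters are hypothetical.

```python
import numpy as np

def client_gradient(X, y, w):
    """Client side: gradient of 0.5*mean((Xw - y)^2) w.r.t. w on local data."""
    return X.T @ (X @ w - y) / len(y)

def aggregate(grads, sizes):
    """Server side: size-weighted average of client gradients (an assumed scheme)."""
    weights = np.asarray(sizes, dtype=float) / np.sum(sizes)
    return np.sum([wt * g for wt, g in zip(weights, grads)], axis=0)

def matching_loss_and_grads(Xs, ys, w, g_real):
    """Squared distance between the distilled-data gradient and the aggregated
    real gradient, plus its analytic gradients w.r.t. the distilled data."""
    m = len(ys)
    r = Xs @ w - ys                      # residuals on the distilled set
    diff = Xs.T @ r / m - g_real         # gradient mismatch
    loss = 0.5 * diff @ diff
    grad_Xs = (np.outer(r, diff) + np.outer(Xs @ diff, w)) / m
    grad_ys = -(Xs @ diff) / m
    return loss, grad_Xs, grad_ys

def distill(clients, m=10, steps=200, lr=0.05, seed=0):
    """Learn m synthetic points whose gradients mimic the aggregated client gradients."""
    rng = np.random.default_rng(seed)
    d = clients[0][0].shape[1]
    Xs, ys = rng.normal(size=(m, d)), rng.normal(size=m)
    sizes = [len(y) for _, y in clients]
    for _ in range(steps):
        w = rng.normal(size=d)           # fresh random model (assumption)
        g_real = aggregate([client_gradient(X, y, w) for X, y in clients], sizes)
        _, g_Xs, g_ys = matching_loss_and_grads(Xs, ys, w, g_real)
        Xs -= lr * g_Xs                  # only the distilled data is updated
        ys -= lr * g_ys
    return Xs, ys
```

In this toy setting the size-weighted average of client gradients equals the gradient on the pooled data, so the server can fit the distilled set without ever seeing raw client examples. Once a distilled set exists, hyperparameter tuning or architecture search could in principle run on it centrally, which is the communication saving the abstract describes.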

