CR · AI · LG · May 23, 2022

LIA: Privacy-Preserving Data Quality Evaluation in Federated Learning Using a Lazy Influence Approximation

arXiv:2205.11518v4 · 3 citations · h-index: 61
Originality: Incremental advance
AI Analysis

This work addresses privacy concerns in data valuation for Federated Learning and offers a practical way to filter corrupted data. The advance is incremental, since it builds on existing influence approximation techniques.

The paper tackles the problem of handling low-quality or malicious data in Federated Learning by proposing a privacy-preserving data quality evaluation method using lazy influence approximation, achieving recall rates over 90% (up to 100%) with strong differential privacy guarantees (ε ≤ 1).

In Federated Learning, it is crucial to handle low-quality, corrupted, or malicious data. However, traditional data valuation methods are not suitable due to privacy concerns. To address this, we propose a simple yet effective approach that utilizes a new influence approximation called "lazy influence" to filter and score data while preserving privacy. Each participant uses their own data to estimate the influence of another participant's batch and sends a differentially private obfuscated score to the central coordinator. Our method has been shown to successfully filter out biased and corrupted data in various simulated and real-world settings, achieving a recall rate of $>90\%$ (sometimes up to $100\%$) while maintaining strong differential privacy guarantees with $\varepsilon \leq 1$.
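The scoring flow described in the abstract can be illustrated with a minimal Python sketch. It rests on assumptions the text above does not confirm: each validating participant reduces its influence estimate to a single yes/no vote (did the candidate batch lower the loss on my own data?), and the obfuscation is plain randomized response, a standard $\varepsilon$-differentially-private mechanism for one bit. The function names `influence_vote`, `randomized_response`, and `debiased_positive_fraction` are hypothetical and chosen for illustration only.

```python
import math
import random


def influence_vote(loss_before: float, loss_after: float) -> bool:
    """Hypothetical vote: True if the candidate update lowered the loss
    on the validator's own local data."""
    return loss_after < loss_before


def randomized_response(vote: bool, epsilon: float, rng: random.Random) -> bool:
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it.

    The ratio of the two reporting probabilities is e^eps, so one report
    is eps-differentially private with respect to the true vote.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return vote if rng.random() < p_truth else not vote


def debiased_positive_fraction(reports: list, epsilon: float) -> float:
    """Coordinator side: estimate the true fraction of positive votes
    from the noisy reports by inverting the known flip probability."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_truth)) / (2.0 * p_truth - 1.0)


if __name__ == "__main__":
    rng = random.Random(0)
    epsilon = 1.0
    # Assumed toy setup: 80% of validators see a lower loss after the update.
    true_votes = [influence_vote(1.0, 0.9 if rng.random() < 0.8 else 1.1)
                  for _ in range(20000)]
    reports = [randomized_response(v, epsilon, rng) for v in true_votes]
    estimate = debiased_positive_fraction(reports, epsilon)
    # Hypothetical acceptance rule: keep the batch if most validators approve.
    print("estimated approval:", round(estimate, 3), "accept:", estimate > 0.5)
```

In this sketch the coordinator never sees a raw vote, only noisy bits, and still recovers the aggregate approval rate well enough to accept or reject a batch. The majority threshold is an illustrative choice, not a rule taken from the paper.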

