ML LG | Oct 26, 2024

Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts

Amir Najafi, Samin Mahdizadeh Sani, Farzan Farnia

arXiv:2410.20250v2 | h-index: 15 | Has Code | ICML

Originality: Incremental advance

AI Analysis

This paper provides robust evaluation methods for federated learning systems facing distribution shifts, addressing a critical need for reliability in decentralized AI applications, though its contribution is incremental in that it extends existing statistical bounds.

The paper tackles the problem of certifying a federated learning model's performance on an unseen target network with bounded distributional shifts, deriving worst-case guarantees for average loss and risk CDF that are efficiently computable and asymptotically optimal, with empirical validation on real-world tasks.

We address the challenge of certifying the performance of a federated learning model on an unseen target network using only measurements from the source network that trained the model. Specifically, consider a source network "A" with $K$ clients, each holding private, non-IID datasets drawn from heterogeneous distributions, modeled as samples from a broader meta-distribution $μ$. Our goal is to provide certified guarantees for the model's performance on a different, unseen network "B", governed by an unknown meta-distribution $μ'$, assuming the deviation between $μ$ and $μ'$ is bounded either in Wasserstein distance or an $f$-divergence. We derive worst-case uniform guarantees for both the model's average loss and its risk CDF, the latter corresponding to a novel, adversarially robust version of the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality. In addition, we show how the vanilla DKW bound enables principled certification of the model's true performance on unseen clients within the same (source) network. Our bounds are efficiently computable, asymptotically minimax optimal, and preserve clients' privacy. We also establish non-asymptotic generalization bounds that converge to zero as $K$ grows and the minimum per-client sample size exceeds $\mathcal{O}(\log K)$. Empirical evaluations confirm the practical utility of our bounds across real-world tasks. The project code is available at: github.com/samin-mehdizadeh/Robust-Evaluation-DKW
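To make the same-network certification step concrete, the following is a minimal sketch of the classical (vanilla) DKW band applied to per-client risks. It is not the authors' implementation and does not cover the adversarially robust variant. It assumes each client's risk is known exactly, whereas the paper also accounts for per-client estimation error. The function names `dkw_epsilon` and `dkw_cdf_band` and the synthetic risks are hypothetical, chosen only for illustration. The band half-width follows from the standard DKW inequality, $P(\sup_t |F_K(t) - F(t)| > \epsilon) \le 2e^{-2K\epsilon^2}$, which gives $\epsilon = \sqrt{\ln(2/\delta)/(2K)}$.

```python
import numpy as np


def dkw_epsilon(num_clients: int, delta: float) -> float:
    """Half-width of the two-sided DKW band at confidence level 1 - delta."""
    return float(np.sqrt(np.log(2.0 / delta) / (2.0 * num_clients)))


def dkw_cdf_band(client_risks, thresholds, delta=0.05):
    """Empirical risk CDF over clients with a uniform DKW confidence band.

    client_risks: one scalar risk per source client (assumed exact here).
    thresholds:   risk levels t at which to evaluate the CDF.
    Returns (ecdf, lower, upper), each with the same shape as thresholds.
    """
    risks = np.sort(np.asarray(client_risks, dtype=float))
    thresholds = np.asarray(thresholds, dtype=float)
    # Fraction of clients whose risk is <= t, for each threshold t.
    ecdf = np.searchsorted(risks, thresholds, side="right") / risks.size
    eps = dkw_epsilon(risks.size, delta)
    # A CDF lies in [0, 1], so the band is clipped to that range.
    lower = np.clip(ecdf - eps, 0.0, 1.0)
    upper = np.clip(ecdf + eps, 0.0, 1.0)
    return ecdf, lower, upper


if __name__ == "__main__":
    # Synthetic per-client risks, for illustration only.
    rng = np.random.default_rng(0)
    risks = rng.beta(2.0, 5.0, size=200)
    grid = np.linspace(0.0, 1.0, 5)
    ecdf, lower, upper = dkw_cdf_band(risks, grid, delta=0.05)
    for t, f, lo, hi in zip(grid, ecdf, lower, upper):
        print(f"t={t:.2f}  ecdf={f:.3f}  band=[{lo:.3f}, {hi:.3f}]")
```

Under these assumptions, with probability at least $1 - \delta$ the lower curve bounds from below the fraction of unseen same-network clients whose risk is at most $t$, simultaneously for all $t$. The half-width shrinks as $1/\sqrt{K}$, consistent with the abstract's statement that the bounds tighten as $K$ grows.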
