LG
Oct 2, 2025

Private Federated Multiclass Post-hoc Calibration

arXiv:2510.01987v1
h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses the need for reliable calibration in privacy-sensitive domains like healthcare and finance, but it is incremental as it adapts existing methods to federated and DP contexts.

The paper addresses the calibration of machine learning models in federated learning (FL), where data is distributed across clients and cannot be centralized for privacy reasons. It adapts centralized calibration methods such as histogram binning and temperature scaling to FL and differential privacy (DP) settings. The reported finding is that federated temperature scaling works best under DP-FL, while weighted binning works best when DP is not required.
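To make the temperature scaling idea concrete, here is a minimal sketch of one way it could be federated. It is not the paper's algorithm: it assumes each client fits a single temperature by grid search on its local negative log-likelihood and the server takes a sample-weighted mean, in the style of federated averaging. The function names, the grid, and the aggregation rule are all illustrative assumptions, and no DP noise is added.

```python
import math


def nll(logits, labels, temperature):
    """Mean negative log-likelihood of softmax(logits / temperature)."""
    total = 0.0
    for z, y in zip(logits, labels):
        scaled = [v / temperature for v in z]
        m = max(scaled)  # subtract the max for numerical stability
        log_norm = m + math.log(sum(math.exp(v - m) for v in scaled))
        total += log_norm - scaled[y]
    return total / len(labels)


def fit_local_temperature(logits, labels, grid=None):
    """Client side: pick the grid temperature that minimises local NLL."""
    # Hypothetical search grid covering 0.25 to 5.0 in steps of 0.05.
    grid = grid or [0.25 + 0.05 * i for i in range(96)]
    return min(grid, key=lambda t: nll(logits, labels, t))


def aggregate_temperatures(client_results):
    """Server side: sample-weighted mean of (temperature, n_samples) pairs.

    Assumption: weighting by sample count, as in federated averaging.
    """
    total = sum(n for _, n in client_results)
    return sum(t * n for t, n in client_results) / total


# An overconfident client: every logit vector strongly favours class 0,
# but only 7 of the 10 labels are class 0.
client_logits = [[4.0, 0.0, 0.0]] * 10
client_labels = [0] * 7 + [1] * 3
local_t = fit_local_temperature(client_logits, client_labels)

# Combine two hypothetical client results on the server.
global_t = aggregate_temperatures([(2.0, 100), (1.0, 300)])
```

On the overconfident client the fitted temperature comes out above 1, which softens the predicted probabilities. Averaging per-client temperatures is only one possible aggregation rule; under strong client heterogeneity a plain weighted mean may be a poor summary, which is the kind of degradation the paper sets out to mitigate.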

Calibrating machine learning models so that predicted probabilities better reflect the true outcome frequencies is crucial for reliable decision-making across many applications. In Federated Learning (FL), the goal is to train a global model on data which is distributed across multiple clients and cannot be centralized due to privacy concerns. FL is applied in key areas such as healthcare and finance where calibration is strongly required, yet federated private calibration has been largely overlooked. This work introduces the integration of post-hoc model calibration techniques within FL. Specifically, we transfer traditional centralized calibration methods such as histogram binning and temperature scaling into federated environments and define new methods to operate them under strong client heterogeneity. We study (1) a federated setting and (2) a user-level Differential Privacy (DP) setting and demonstrate how both federation and DP impact calibration accuracy. We propose strategies to mitigate the degradation commonly observed under heterogeneity. Our findings highlight that our federated temperature scaling works best for DP-FL, whereas our weighted binning approach is best when DP is not required.
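Histogram binning lends itself to federation because its sufficient statistics are plain counts. The sketch below is an illustration under stated assumptions, not the weighted binning method the abstract refers to: each client reports, per confidence bin, how many top-label predictions fell in the bin and how many were correct, and the server sums these counts. The function names, the equal-width bins, and the midpoint fallback for empty bins are hypothetical choices, and no DP noise is added.

```python
def local_bin_counts(confidences, correct, n_bins=10):
    """Client side: per-bin prediction counts and correct-prediction counts."""
    counts = [0] * n_bins
    hits = [0] * n_bins
    for c, ok in zip(confidences, correct):
        b = min(int(c * n_bins), n_bins - 1)  # a confidence of 1.0 goes to the last bin
        counts[b] += 1
        hits[b] += int(ok)
    return counts, hits


def aggregate_bins(client_stats, n_bins=10):
    """Server side: sum client counts and return per-bin empirical accuracy."""
    counts = [0] * n_bins
    hits = [0] * n_bins
    for c, h in client_stats:
        for b in range(n_bins):
            counts[b] += c[b]
            hits[b] += h[b]
    # Assumption: an empty bin falls back to its midpoint confidence.
    return [
        hits[b] / counts[b] if counts[b] else (b + 0.5) / n_bins
        for b in range(n_bins)
    ]


def calibrate(confidence, bin_values):
    """Replace a raw confidence with the accuracy of the bin it falls in."""
    n_bins = len(bin_values)
    return bin_values[min(int(confidence * n_bins), n_bins - 1)]


# Two hypothetical clients reporting top-label confidences and correctness.
stats_a = local_bin_counts([0.95, 0.92, 0.55], [True, False, True])
stats_b = local_bin_counts([0.97, 0.91], [True, True])
bin_values = aggregate_bins([stats_a, stats_b])
calibrated = calibrate(0.93, bin_values)
```

Here the top bin holds four predictions of which three are correct, so a raw confidence of 0.93 is mapped to 0.75. Because only counts leave each client, the raw data stays local; a private variant would additionally have to perturb these counts, which this sketch does not attempt.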

