Silent Failures in Federated Personalization of Foundation Models
For researchers and practitioners deploying personalized foundation models via federated learning, this paper highlights a critical gap in trustworthiness evaluation and proposes a research agenda to address it.
The paper identifies a new class of trustworthiness failures, "Silent Failures," in the federated personalization of foundation models. These include amplified bias, fairness collapse, and alignment erosion, and they are hard to detect because federated learning's privacy constraints limit visibility into model behavior. The paper also introduces a taxonomy of six failure modes and argues that privacy-preserving training alone is insufficient for trustworthy deployment.
Foundation models are increasingly personalized on decentralized private data through federated learning and are now deployed at scale under growing regulatory requirements for post-market monitoring. We argue that this convergence creates a distinct and under-recognized class of trustworthiness failures, which we term "Silent Failures." These include amplified bias, fairness collapse, and alignment erosion, all of which may remain difficult to detect because federated learning's privacy constraints limit visibility into model behavior. A landscape analysis of existing benchmarks reveals a structural divide: federated benchmarks evaluate system performance but provide limited insight into model behavior, whereas centralized trustworthiness benchmarks assess behavior but require a level of model access that is incompatible with federated privacy. We introduce a taxonomy of six silent failure modes arising from the interaction of foundation model personalization, dataset shift, and core federated constraints. Our analysis shows that privacy-preserving training alone is insufficient for trustworthy deployment. We conclude with a research agenda for privacy-preserving behavioral evaluation and propose that silent failures become a standard diagnostic category for trustworthy federated artificial intelligence.
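To make the divide between system-level and behavioral evaluation concrete, the following is a minimal sketch, not the paper's method, of how a pooled accuracy figure can look healthy while one group's accuracy has collapsed. The `ClientReport` schema, the group labels "A" and "B", and all counts are hypothetical toy values. The sketch also assumes that clients can share per-group counts at all; in a real deployment those counts would themselves need protection, for example through secure aggregation or differential privacy.

```python
from dataclasses import dataclass


@dataclass
class ClientReport:
    """Per-group evaluation counts reported by one client (hypothetical schema)."""
    n_correct: dict  # group label -> number of correct predictions
    n_total: dict    # group label -> number of evaluated examples


def global_accuracy(reports):
    """Pooled accuracy over all clients and groups (the system-level view)."""
    correct = sum(sum(r.n_correct.values()) for r in reports)
    total = sum(sum(r.n_total.values()) for r in reports)
    return correct / total


def per_group_accuracy(reports):
    """Accuracy per group, pooled across clients (the behavioral view)."""
    correct, total = {}, {}
    for r in reports:
        for group, n in r.n_total.items():
            total[group] = total.get(group, 0) + n
            correct[group] = correct.get(group, 0) + r.n_correct.get(group, 0)
    return {g: correct[g] / total[g] for g in total if total[g] > 0}


def accuracy_gap(reports):
    """Difference between the best-served and worst-served group."""
    acc = per_group_accuracy(reports)
    return max(acc.values()) - min(acc.values())


# Toy data: group "B" is rare on every client and poorly served.
reports = [
    ClientReport(n_correct={"A": 95, "B": 2}, n_total={"A": 100, "B": 5}),
    ClientReport(n_correct={"A": 90, "B": 3}, n_total={"A": 95, "B": 5}),
]

print(f"global accuracy: {global_accuracy(reports):.3f}")
print(f"per-group accuracy: {per_group_accuracy(reports)}")
print(f"accuracy gap: {accuracy_gap(reports):.3f}")
```

With these toy counts the pooled accuracy stays above 0.9 while group "B" is classified correctly only half the time, which is the kind of discrepancy a purely system-level benchmark would not surface.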