LG · DC · Apr 26, 2023

Federated Learning with Uncertainty-Based Client Clustering for Fleet-Wide Fault Diagnosis

arXiv:2304.13275v1 · 21 citations · h-index: 39 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses data scarcity and dataset heterogeneity for operators in industrial monitoring. It appears incremental: it builds on existing FL methods and adds a novel clustering approach.

The paper tackles the problem of client dataset heterogeneity in federated learning for fault diagnosis by proposing a clustering-based FL algorithm that groups clients based on dataset similarity, measured via prediction accuracy and uncertainty on local test sets without data sharing.

Operators from various industries have been pushing the adoption of wireless sensing nodes for industrial monitoring, and such efforts have produced sizeable condition monitoring datasets that can be used to build diagnosis algorithms capable of warning maintenance engineers of impending failure or identifying current system health conditions. However, a single operator may not have a large enough fleet of systems or component units to collect sufficient data for developing data-driven algorithms. Collecting a satisfactory quantity of fault patterns for safety-critical systems is particularly difficult because faults are rare. Federated learning (FL) has emerged as a promising way to leverage datasets from multiple operators to train a decentralized asset fault diagnosis model while maintaining data confidentiality. However, considerable obstacles remain: optimizing the federation strategy without leaking sensitive data, and addressing client dataset heterogeneity. Heterogeneity is particularly prevalent in fault diagnosis applications due to the high diversity of operating conditions and system configurations. To address these two challenges, we propose a novel clustering-based FL algorithm where clients are clustered for federating based on dataset similarity. To quantify dataset similarity between clients without explicitly sharing data, each client sets aside a local test dataset and evaluates the other clients' model prediction accuracy and uncertainty on this test dataset. Clients are then clustered for FL based on relative prediction accuracy and uncertainty.
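The cross-evaluation idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the nearest-centroid model, the score `accuracy * (1 - normalized entropy)`, the threshold-based grouping, and all function names are assumptions chosen only to make the idea concrete. Each client's model is scored on every other client's held-out test set, and clients whose models transfer well in both directions are grouped.

```python
import numpy as np

def fit_centroid_model(X, y, n_classes):
    """Toy stand-in for a client's local model (assumption, not the paper's network)."""
    centroids = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])

    def predict_proba(X_new):
        # Softmax over negative squared distances to each class centroid.
        d2 = ((X_new[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        logits = -d2
        logits -= logits.max(axis=1, keepdims=True)
        p = np.exp(logits)
        return p / p.sum(axis=1, keepdims=True)

    return predict_proba

def accuracy_and_uncertainty(probs, labels):
    """Return accuracy and mean predictive entropy normalized to [0, 1]."""
    acc = float(np.mean(np.argmax(probs, axis=1) == labels))
    entropy = -np.sum(probs * np.log(probs + 1e-12), axis=1)
    return acc, float(np.mean(entropy) / np.log(probs.shape[1]))

def affinity_matrix(models, test_sets):
    """affinity[i, j]: how well client j's model does on client i's local test set."""
    n = len(models)
    aff = np.zeros((n, n))
    for i, (X_test, y_test) in enumerate(test_sets):
        for j, model in enumerate(models):
            acc, unc = accuracy_and_uncertainty(model(X_test), y_test)
            aff[i, j] = acc * (1.0 - unc)  # hypothetical combined score
    return aff

def cluster_clients(affinity, threshold):
    """Group clients connected by mutual affinity >= threshold (connected components)."""
    mutual = np.minimum(affinity, affinity.T)
    n = len(mutual)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and mutual[i, j] >= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

def make_client(rng, flipped, n=50):
    """Synthetic two-class data; 'flipped' clients use the opposite label mapping."""
    X = np.vstack([rng.normal([0.0, 0.0], 0.3, size=(n, 2)),
                   rng.normal([4.0, 0.0], 0.3, size=(n, 2))])
    y = np.repeat([0, 1], n)
    return X, (1 - y if flipped else y)

rng = np.random.default_rng(0)
flags = [False, False, True, True]  # two heterogeneous groups of clients
models = [fit_centroid_model(*make_client(rng, f), n_classes=2) for f in flags]
test_sets = [make_client(rng, f) for f in flags]  # local held-out sets, never shared
clusters = cluster_clients(affinity_matrix(models, test_sets), threshold=0.5)
print(clusters)
```

Only models and scalar scores cross client boundaries in this sketch; the raw test data stays local, which mirrors the confidentiality constraint described above. In practice the models would be neural networks and the uncertainty estimate and clustering rule would follow the paper and its repository.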

Code Implementations: 1 repo