LG | Apr 2, 2025

UniFault: A Fault Diagnosis Foundation Model from Bearing Data

Emadeldeen Eldele, Mohamed Ragab, Xu Qing, Edward, Zhenghua Chen, Min Wu, Xiaoli Li, Jay Lee

arXiv:2504.01373v2 | 4 citations | h-index: 25

Originality: Incremental advance

AI Analysis

This paper addresses scalable predictive maintenance for industrial applications, though it appears incremental: it adapts foundation model concepts to a specific domain.

The paper tackles limited generalization in machine fault diagnosis models by introducing UniFault, a foundation model that addresses data heterogeneity through a data harmonization pipeline combining a unification scheme with cross-domain temporal fusion. The model, pretrained on over 6.9 million samples, achieves state-of-the-art few-shot performance on real-world datasets.

Machine fault diagnosis (FD) is a critical task for predictive maintenance, enabling early fault detection and preventing unexpected failures. Despite its importance, existing FD models are operation-specific with limited generalization across diverse datasets. Foundation models (FMs) have demonstrated remarkable potential in both visual and language domains, achieving impressive generalization capabilities even with minimal data through few-shot or zero-shot learning. However, translating these advances to FD presents unique hurdles. Unlike the large-scale, cohesive datasets available for images and text, FD datasets are typically smaller and more heterogeneous, with significant variations in sampling frequencies and the number of channels across different systems and applications. This heterogeneity complicates the design of a universal architecture capable of effectively processing such diverse data while maintaining robust feature extraction and learning capabilities. In this paper, we introduce UniFault, a foundation model for fault diagnosis that systematically addresses these issues. Specifically, the model incorporates a comprehensive data harmonization pipeline featuring two key innovations. First, a unification scheme transforms multivariate inputs into standardized univariate sequences. Second, a novel cross-domain temporal fusion strategy mitigates distribution shifts and enriches sample diversity and count, improving the model's generalization across varying conditions. UniFault is pretrained on over 6.9 million samples spanning diverse FD datasets, enabling superior few-shot performance. Extensive experiments on real-world FD datasets demonstrate that UniFault achieves state-of-the-art performance, setting a new benchmark for fault diagnosis models and paving the way for more scalable and robust predictive maintenance solutions.
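The abstract names the two harmonization steps but does not describe how they are implemented, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes unification can be approximated by resampling every channel to a common rate, normalizing it, and cutting it into fixed-length univariate windows, and that cross-domain temporal fusion can be approximated by splicing two windows from different datasets at a random cut point. The function names `unify` and `temporal_fusion`, the parameters `target_fs` and `length`, and the use of linear interpolation are all hypothetical choices made for illustration.

```python
import numpy as np


def unify(x, fs, target_fs=12_000, length=1024):
    """Turn one recording of shape (channels, time) into univariate windows.

    Assumption: each channel is resampled to `target_fs` by linear
    interpolation, z-normalized, and cut into non-overlapping windows.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    n_out = int(round(x.shape[1] * target_fs / fs))
    t_in = np.arange(x.shape[1]) / fs          # original time stamps (s)
    t_out = np.arange(n_out) / target_fs       # resampled time stamps (s)
    windows = []
    for channel in x:
        y = np.interp(t_out, t_in, channel)    # resample to the common rate
        y = (y - y.mean()) / (y.std() + 1e-8)  # per-channel normalization
        n_win = len(y) // length
        windows.append(y[: n_win * length].reshape(n_win, length))
    # Channels are stacked along the window axis, so the output is univariate.
    return np.concatenate(windows, axis=0)


def temporal_fusion(a, b, rng):
    """Assumed stand-in for fusion: splice two windows at a random cut."""
    cut = int(rng.integers(1, len(a)))         # cut point in [1, len(a) - 1]
    return np.concatenate([a[:cut], b[cut:]])


rng = np.random.default_rng(0)
rec_a = rng.standard_normal((2, 48_000))  # synthetic: 2 channels, 48 kHz, 1 s
rec_b = rng.standard_normal((1, 25_600))  # synthetic: 1 channel, 25.6 kHz, 1 s
wa = unify(rec_a, fs=48_000)
wb = unify(rec_b, fs=25_600)
mixed = temporal_fusion(wa[0], wb[0], rng)
```

Both synthetic recordings end up as arrays of 1024-sample windows regardless of their original channel count or sampling rate, which is the property a single shared encoder would need. The splice in `temporal_fusion` produces a new window whose start comes from one source and whose end comes from the other.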
