SPJul 20, 2024
EEGMamba: Bidirectional State Space Model with Mixture of Experts for EEG Multi-task ClassificationYiyu Gui, MingZhi Chen, Yuqi Su et al.
In recent years, with the development of deep learning, electroencephalogram (EEG) classification networks have achieved certain progress. Transformer-based models can perform well in capturing long-term dependencies in EEG signals. However, their quadratic computational complexity poses a substantial computational challenge. Moreover, most EEG classification models are only suitable for single tasks and struggle with generalization across different tasks, particularly when faced with variations in signal length and channel count. In this paper, we introduce EEGMamba, the first universal EEG classification network to truly implement multi-task learning for EEG applications. EEGMamba seamlessly integrates the Spatio-Temporal-Adaptive (ST-Adaptive) module, bidirectional Mamba, and Mixture of Experts (MoE) into a unified framework. The proposed ST-Adaptive module performs unified feature extraction on EEG signals of different lengths and channel counts through spatial-adaptive convolution and incorporates a class token to achieve temporal-adaptability. Moreover, we design a bidirectional Mamba particularly suitable for EEG signals for further feature extraction, balancing high accuracy, fast inference speed, and efficient memory-usage in processing long EEG signals. To enhance the processing of EEG data across multiple tasks, we introduce task-aware MoE with a universal expert, effectively capturing both differences and commonalities among EEG data from different tasks. We evaluate our model on eight publicly available EEG datasets, and the experimental results demonstrate its superior performance in four types of tasks: seizure detection, emotion recognition, sleep stage classification, and motor imagery. The code is set to be released soon.
SPJul 20, 2024
Improving EEG Classification Through Randomly Reassembling Original and Generated Data with Transformer-based Diffusion ModelsMingzhi Chen, Yiyu Gui, Yuqi Su et al.
Electroencephalogram (EEG) classification has been widely used in various medical and engineering applications, where it is important for understanding brain function, diagnosing diseases, and assessing mental health conditions. However, the scarcity of EEG data severely restricts the performance of EEG classification networks, and generative model-based data augmentation methods have emerged as potential solutions to overcome this challenge. There are two problems with existing methods: (1) The quality of the generated EEG signals is not high; (2) The enhancement of EEG classification networks is not effective. In this paper, we propose a Transformer-based denoising diffusion probabilistic model and a generated data-based augmentation method to address the above two problems. For the characteristics of EEG signals, we propose a constant-factor scaling method to preprocess the signals, which reduces the loss of information. We incorporated Multi-Scale Convolution and Dynamic Fourier Spectrum Information modules into the model, improving the stability of the training process and the quality of the generated data. The proposed augmentation method randomly reassemble the generated data with original data in the time-domain to obtain vicinal data, which improves the model performance by minimizing the empirical risk and the vicinal risk. We verify the proposed augmentation method on four EEG datasets for four tasks and observe significant accuracy performance improvements: 14.00% on the Bonn dataset; 6.38% on the SleepEDF-20 dataset; 9.42% on the FACED dataset; 2.5% on the Shu dataset. We will make the code of our method publicly accessible soon.
0.9LGApr 11
A Multi-head Attention Fusion Network for Industrial Prognostics under Discrete Operational ConditionsYuqi Su, Xiaolei Fang
Complex systems such as aircraft engines, turbines, and industrial machinery often operate under dynamically changing conditions. These varying operating conditions can substantially influence degradation behavior and make prognostic modeling more challenging, as accurate prediction requires explicit consideration of operational effects. To address this issue, this paper proposes a novel multi-head attention-based fusion neural network. The proposed framework explicitly models and integrates three signal components: (1) the monotonic degradation trend, which reflects the underlying deterioration of the system; (2) discrete operating states, identified through clustering and encoded into dense embeddings; and (3) residual random noise, which captures unexplained variation in sensor measurements. The core strength of the framework lies in its architecture, which combines BiLSTM networks with attention mechanisms to better capture complex temporal dependencies. The attention mechanism allows the model to adaptively weight different time steps and sensor signals, improving its ability to extract prognostically relevant information. In addition, a fusion module is designed to integrate the outputs from the degradation-trend branch and the operating-state embeddings, enabling the model to capture their interactions more effectively. The proposed method is validated using a dataset from the NASA repository, and the results demonstrate its effectiveness.
LGDec 11, 2023
Federated Multilinear Principal Component Analysis with Applications in PrognosticsChengyu Zhou, Yuqi Su, Tangbin Xia et al.
Multilinear Principal Component Analysis (MPCA) is a widely utilized method for the dimension reduction of tensor data. However, the integration of MPCA into federated learning remains unexplored in existing research. To tackle this gap, this article proposes a Federated Multilinear Principal Component Analysis (FMPCA) method, which enables multiple users to collaboratively reduce the dimension of their tensor data while keeping each user's data local and confidential. The proposed FMPCA method is guaranteed to have the same performance as traditional MPCA. An application of the proposed FMPCA in industrial prognostics is also demonstrated. Simulated data and a real-world data set are used to validate the performance of the proposed method.
MLOct 14, 2024
A Two-Stage Federated Learning Approach for Industrial Prognostics Using Large-Scale High-Dimensional SignalsYuqi Su, Xiaolei Fang
Industrial prognostics aims to develop data-driven methods that leverage high-dimensional degradation signals from assets to predict their failure times. The success of these models largely depends on the availability of substantial historical data for training. However, in practice, individual organizations often lack sufficient data to independently train reliable prognostic models, and privacy concerns prevent data sharing between organizations for collaborative model training. To overcome these challenges, this article proposes a statistical learning-based federated model that enables multiple organizations to jointly train a prognostic model while keeping their data local and secure. The proposed approach involves two key stages: federated dimension reduction and federated (log)-location-scale regression. In the first stage, we develop a federated randomized singular value decomposition algorithm for multivariate functional principal component analysis, which efficiently reduces the dimensionality of degradation signals while maintaining data privacy. The second stage proposes a federated parameter estimation algorithm for (log)-location-scale regression, allowing organizations to collaboratively estimate failure time distributions without sharing raw data. The proposed approach addresses the limitations of existing federated prognostic methods by using statistical learning techniques that perform well with smaller datasets and provide comprehensive failure time distributions. The effectiveness and practicality of the proposed model are validated using simulated data and a dataset from the NASA repository.
LGMay 9, 2024
Deep Learning-Based Residual Useful Lifetime Prediction for Assets with Uncertain Failure ModesYuqi Su, Xiaolei Fang
Industrial prognostics focuses on utilizing degradation signals to forecast and continually update the residual useful life of complex engineering systems. However, existing prognostic models for systems with multiple failure modes face several challenges in real-world applications, including overlapping degradation signals from multiple components, the presence of unlabeled historical data, and the similarity of signals across different failure modes. To tackle these issues, this research introduces two prognostic models that integrate the mixture (log)-location-scale distribution with deep learning. This integration facilitates the modeling of overlapping degradation signals, eliminates the need for explicit failure mode identification, and utilizes deep learning to capture complex nonlinear relationships between degradation signals and residual useful lifetimes. Numerical studies validate the superior performance of these proposed models compared to existing methods.