ML LG · Jul 1, 2024

Evaluating the Role of Data Enrichment Approaches Towards Rare Event Analysis in Manufacturing

Chathurangi Shyalika, Ruwan Wickramarachchi, Fadi El Kalach, Ramy Harik, Amit Sheth

arXiv:2407.01644v17.512 citations · h-index: 29

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of data imbalance in manufacturing for predicting rare events, which is important for reducing downtime and costs, but it is incremental as it applies existing enrichment methods to this domain.

This paper tackles the problem of predicting rare events in manufacturing, which cause unplanned downtime and high energy consumption, by evaluating data enrichment techniques combined with supervised machine learning; the results show that enrichment improves the F1 measure by up to 48% in rare failure event detection and prediction.

Rare events are occurrences that take place with a significantly lower frequency than more common regular events. In manufacturing, predicting such events is particularly important, as they lead to unplanned downtime, shortened equipment lifespan, and high energy consumption. An event is considered frequently-rare if it is observed in more than 10% of all instances, moderately-rare if in 5-10%, very-rare if in 1-5%, and extremely-rare if in less than 1%. The rarity of events is inversely correlated with the maturity of a manufacturing industry. Typically, the rarity of events causes the multivariate data generated within a manufacturing process to be highly imbalanced, which leads to bias in predictive models. This paper evaluates the role of data enrichment techniques combined with supervised machine-learning techniques for rare event detection and prediction. To address data scarcity, we use time series data augmentation and sampling methods to amplify the dataset with more multivariate features and data points while preserving the underlying time series patterns in the combined alterations. Imputation techniques are used to handle null values in the datasets. We consider 15 learning models, ranging from statistical learning to machine learning to deep learning methods, identify the best-performing model for the selected datasets, and evaluate the efficacy of data enrichment. Our results show that the enrichment procedure improves the F1 measure of supervised prediction models by up to 48% in rare failure event detection and prediction. We also conduct empirical and ablation experiments on the datasets to derive novel dataset-specific insights. Finally, we investigate the interpretability of models for rare event prediction, considering multiple methods.
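
The rarity categories in the abstract can be read as a simple threshold rule on the share of instances in which an event occurs. The sketch below is illustrative only and is not taken from the paper: the function name `rarity_level` and its parameters are hypothetical, and the handling of the exact boundaries (1%, 5%, and 10%) is an assumption, since the abstract does not say which category a boundary value belongs to.

```python
def rarity_level(event_count: int, total_count: int) -> str:
    """Map an event's share of all instances to a rarity category.

    Hypothetical helper. Boundary handling is an assumption:
    exactly 1% counts as very-rare, exactly 5% and exactly 10%
    count as moderately-rare.
    """
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    if not 0 <= event_count <= total_count:
        raise ValueError("event_count must lie between 0 and total_count")

    share = 100.0 * event_count / total_count  # percentage of all instances

    if share < 1:
        return "extremely-rare"
    if share < 5:
        return "very-rare"
    if share <= 10:
        return "moderately-rare"
    return "frequently-rare"


if __name__ == "__main__":
    # 30 failure events among 1000 observations is a 3% share.
    print(rarity_level(30, 1000))
```

Under these assumptions, a dataset with 30 failure events in 1000 observations falls in the very-rare band, which is the kind of imbalance the enrichment methods are meant to address.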
