LG · AI · Jan 26, 2024

SCANIA Component X Dataset: A Real-World Multivariate Time Series Dataset for Predictive Maintenance

arXiv:2401.15199v2 · 15 citations · Sci Data
Originality: Synthesis-oriented
AI Analysis

The paper gives researchers a standard benchmark for predictive maintenance, though its contribution is incremental: it releases a new dataset rather than advancing methods.

The paper tackles the scarcity of real-world multivariate time series datasets for predictive maintenance by introducing a dataset from SCANIA trucks' Component X, including operational data, repair records, and specifications, to enable applications like classification and anomaly detection.

Predicting failures and maintenance time in predictive maintenance is challenging due to the scarcity of comprehensive real-world datasets, and among those available, few are of time series format. This paper introduces a real-world, multivariate time series dataset collected exclusively from a single anonymized engine component (Component X) across a fleet of SCANIA trucks. The dataset includes operational data, repair records, and specifications related to Component X, while maintaining confidentiality through anonymization. It is well-suited for a range of machine learning applications, including classification, regression, survival analysis, and anomaly detection, particularly in predictive maintenance scenarios. The dataset's large population size, diverse features (in the form of histograms and numerical counters), and temporal information make it a unique resource in the field. The objective of releasing this dataset is to give a broad range of researchers the possibility of working with real-world data from an internationally well-known company and introduce a standard benchmark to the predictive maintenance field, fostering reproducible research.
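As a rough illustration of how operational readouts and repair records might be combined into targets for survival analysis and classification, the sketch below uses plain Python. Every name in it (`Readout`, `RepairRecord`, `vehicle_id`, `time_step`, `counter`, `repair_time`, `survival_labels`, `fails_within`) and all of the sample values are hypothetical and chosen for illustration; they are not the dataset's actual schema or the authors' labeling procedure.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Readout:
    """One operational readout (hypothetical layout)."""
    vehicle_id: int
    time_step: float
    counter: float  # stand-in for a single numerical counter feature


@dataclass
class RepairRecord:
    """One repair record; repair_time is None when no repair was observed."""
    vehicle_id: int
    repair_time: Optional[float]


def survival_labels(
    readouts: List[Readout], repairs: List[RepairRecord]
) -> Dict[int, Tuple[float, bool]]:
    """Map each vehicle to (duration, event_observed).

    Repaired vehicles get their repair time as duration with event=True.
    Vehicles without a repair are treated as right-censored at their
    last readout, with event=False.
    """
    last_seen: Dict[int, float] = {}
    for r in readouts:
        last_seen[r.vehicle_id] = max(
            last_seen.get(r.vehicle_id, r.time_step), r.time_step
        )

    labels: Dict[int, Tuple[float, bool]] = {}
    for rec in repairs:
        if rec.vehicle_id not in last_seen:
            continue  # no operational data for this vehicle
        if rec.repair_time is not None:
            labels[rec.vehicle_id] = (rec.repair_time, True)
        else:
            labels[rec.vehicle_id] = (last_seen[rec.vehicle_id], False)
    return labels


def fails_within(
    readout_time: float, repair_time: Optional[float], horizon: float
) -> bool:
    """Binary classification target: does a repair occur within `horizon`
    time units after the readout?"""
    if repair_time is None:
        return False
    return 0 <= repair_time - readout_time <= horizon


# Made-up sample data, for illustration only.
readouts = [
    Readout(1, 10.0, 5.0),
    Readout(1, 20.0, 9.0),
    Readout(2, 15.0, 3.0),
]
repairs = [RepairRecord(1, 25.0), RepairRecord(2, None)]
labels = survival_labels(readouts, repairs)
```

With this sample data, vehicle 1 is labeled as an observed event at its repair time and vehicle 2 as censored at its last readout. Treating unrepaired vehicles as right-censored, rather than as negatives, is what separates the survival-analysis framing from a plain classification framing.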

