LG, AI, IV | Mar 27, 2024

Computationally and Memory-Efficient Robust Predictive Analytics Using Big Data

arXiv:2403.19721v1 | 5 citations | h-index: 6 | CAI
Originality: Synthesis-oriented
AI Analysis

This work addresses efficient predictive analytics for big data applications, such as monitoring physical systems. Its contribution is incremental, since it combines existing methods.

The study tackles data uncertainties and storage limitations in big data. It uses Robust Principal Component Analysis for noise reduction and Optimal Sensor Placement for compression, then applies LSTM networks for predictive modeling, which accelerates training on real thermal imaging data.

In the current data-intensive era, big data has become a significant asset for Artificial Intelligence (AI), serving as a foundation for developing data-driven models and providing insight into previously unexplored fields. This study addresses the challenges of data uncertainties, storage limitations, and predictive data-driven modeling using big data. We use Robust Principal Component Analysis (RPCA) for noise reduction and outlier elimination, and Optimal Sensor Placement (OSP) for data compression and storage. The proposed OSP technique compresses the data without substantial information loss, thereby reducing storage needs. While RPCA offers a more robust alternative to traditional Principal Component Analysis (PCA) for managing high-dimensional data, this work extends its use to robust, data-driven modeling applicable to very large data sets in real time. For that purpose, Long Short-Term Memory (LSTM) networks, a type of recurrent neural network, are applied to model and predict data based on a low-dimensional subset obtained from OSP, which substantially accelerates the training phase. LSTMs are well suited to capturing long-term dependencies in time series data, which makes them a natural choice for predicting the future states of physical systems from historical data. All the presented algorithms are not only developed theoretically but also simulated and validated using real thermal imaging data of a ship's engine.
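The first two stages described in the abstract can be illustrated with a short NumPy sketch. The abstract does not say which RPCA solver or which sensor-placement algorithm the authors use, so everything here is an assumption: RPCA is written as principal component pursuit solved with a basic ADMM loop, and sensor placement uses greedy column-pivoted QR on a low-rank basis, a common choice in the sensor-placement literature. The function names (`rpca`, `select_sensors`, `reconstruct`) and the parameter defaults are hypothetical, and the LSTM stage is left out to keep the sketch free of deep-learning dependencies.

```python
import numpy as np


def soft_threshold(X, tau):
    """Shrink every entry of X toward zero by tau."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)


def rpca(M, n_iter=1000, tol=1e-7):
    """Split M into low-rank L plus sparse S (principal component pursuit).

    Assumed solver: plain ADMM with the commonly used defaults
    lam = 1/sqrt(max(m, n)) and mu = m*n / (4 * sum|M|).
    """
    m, n = M.shape
    lam = 1.0 / np.sqrt(max(m, n))
    mu = m * n / (4.0 * np.abs(M).sum())
    L = np.zeros_like(M, dtype=float)
    S = np.zeros_like(M, dtype=float)
    Y = np.zeros_like(M, dtype=float)  # scaled dual variable
    norm_M = np.linalg.norm(M)
    for _ in range(n_iter):
        # Low-rank update: singular value thresholding.
        U, s, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        # Sparse update: entrywise shrinkage captures outliers.
        S = soft_threshold(M - L + Y / mu, lam / mu)
        residual = M - L - S
        Y = Y + mu * residual
        if np.linalg.norm(residual) <= tol * norm_M:
            break
    return L, S


def select_sensors(basis, k):
    """Pick k row indices (sensor locations) of `basis`, shape (n_pixels, r).

    Assumed method: greedy column-pivoted QR on basis.T, with k <= r.
    """
    A = basis.T.astype(float).copy()  # shape (r, n_pixels)
    chosen = []
    for _ in range(k):
        norms = np.linalg.norm(A, axis=0)
        j = int(np.argmax(norms))  # pixel with the largest remaining energy
        chosen.append(j)
        q = A[:, j] / norms[j]
        A -= np.outer(q, q @ A)  # remove the direction q from every column
    return chosen


def reconstruct(basis, chosen, y):
    """Estimate the full state from measurements y taken at `chosen` rows."""
    coeffs, *_ = np.linalg.lstsq(basis[chosen, :], y, rcond=None)
    return basis @ coeffs
```

In this sketch the data matrix is laid out as pixels by time frames. A possible workflow would be to run `rpca` on the raw frames, take the leading left singular vectors of `L` as the basis, store only the pixel values at the indices returned by `select_sensors`, and train a sequence model on those few time series instead of the full images.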

