StreamEnsemble: Predictive Queries over Spatiotemporal Streaming Data
This work significantly improves prediction accuracy for applications that process spatiotemporal streaming data, such as environmental monitoring and traffic prediction.
This paper addresses predictive queries over spatiotemporal streaming data, where data distributions vary in space and time. The proposed StreamEnsemble dynamically selects and allocates machine learning models based on the time series distributions, reducing prediction error by more than a factor of 10 compared to traditional methods.
Predictive queries over spatiotemporal (ST) stream data pose significant data processing and analysis challenges. An ST data stream comprises a set of time series whose data distributions may vary in space and time, exhibiting multiple distinct patterns. In this context, a single machine learning model is unlikely to handle such variation adequately. To address this challenge, we propose StreamEnsemble, a novel approach to predictive queries over ST data that dynamically selects and allocates machine learning models according to the underlying time series distributions and model characteristics. Our experimental evaluation shows that this method markedly outperforms traditional ensemble methods and single-model approaches in both accuracy and time, reducing prediction error by more than a factor of 10.
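
To make the idea of distribution-driven model allocation concrete, the following is a minimal sketch and not the StreamEnsemble algorithm itself. It assumes that each time series window can be summarized by its mean and standard deviation, that each candidate model carries a comparable profile of the data it was trained on, and that a nearest-profile rule decides the allocation. All names (Candidate, summarize, allocate, predictive_query) and the two toy forecasters are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev
from typing import Callable, Dict, List, Sequence, Tuple

Window = Sequence[float]
Profile = Tuple[float, float]  # (mean, standard deviation); an assumed summary


@dataclass
class Candidate:
    """A hypothetical candidate model with the profile of its training data."""
    name: str
    profile: Profile
    predict: Callable[[Window], float]


def summarize(window: Window) -> Profile:
    """Summarize a window by its mean and population standard deviation."""
    return (fmean(window), pstdev(window))


def distance(a: Profile, b: Profile) -> float:
    """Euclidean distance between two profiles (an assumed similarity measure)."""
    return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5


def allocate(windows: Dict[str, Window],
             candidates: List[Candidate]) -> Dict[str, Candidate]:
    """Assign to each series the candidate whose profile is closest to its window."""
    return {
        series_id: min(candidates,
                       key=lambda c: distance(summarize(window), c.profile))
        for series_id, window in windows.items()
    }


def predictive_query(windows: Dict[str, Window],
                     candidates: List[Candidate]) -> Dict[str, float]:
    """Answer a one-step-ahead query using the model allocated to each series."""
    allocation = allocate(windows, candidates)
    return {sid: allocation[sid].predict(windows[sid]) for sid in windows}


# Two toy forecasters standing in for trained models (illustrative only).
candidates = [
    Candidate("persistence", (10.0, 0.5), lambda w: w[-1]),
    Candidate("window_mean", (50.0, 8.0), lambda w: fmean(w)),
]

# Two series with clearly different distributions in the current window.
windows = {
    "sensor_a": [10.0, 10.5, 9.5, 10.0],
    "sensor_b": [40.0, 60.0, 45.0, 55.0],
}

allocation = allocate(windows, candidates)
predictions = predictive_query(windows, candidates)
print({sid: model.name for sid, model in allocation.items()})
print(predictions)
```

In a streaming setting, the allocation step would be rerun as new windows arrive, so that a series whose distribution drifts can be handed to a different model. A richer distribution summary or a learned similarity measure could replace the two-number profile used here.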