Learning Data-Driven Uncertainty Set Partitions for Robust and Adaptive Energy Forecasting with Missing Data
This work addresses operational forecasting challenges in energy systems, such as wind power, by providing robust models that adapt in real-time to missing data, though it is incremental in combining existing optimization and machine learning techniques.
The paper tackles short-term energy forecasting when input data are missing due to failures or disruptions, proposing adaptive models that handle missing data without relying on historical missing-data patterns. The results show that these models perform on par with imputation for very short missing periods and significantly outperform it for longer periods, approaching the performance of the impractical approach of retraining for every missing-data realization.
Short-term forecasting models typically assume the availability of input data (features) when they are deployed and in use. However, equipment failures, disruptions, or cyberattacks may lead to missing features when such models are used operationally, which could reduce forecast accuracy and result in suboptimal operational decisions. In this paper, we use adaptive robust optimization and adversarial machine learning to develop forecasting models that seamlessly handle missing data operationally. We propose linear- and neural network-based forecasting models with parameters that adapt to available features, combining linear adaptation with a novel algorithm for learning data-driven uncertainty set partitions. The proposed adaptive models do not rely on identifying historical missing data patterns and are suitable for real-time operations under stringent time constraints. Extensive numerical experiments on short-term wind power forecasting, considering horizons from 15 minutes to 4 hours ahead, illustrate that our proposed adaptive models are on par with imputation when data are missing for very short periods (e.g., when only the latest measurement is missing), whereas they significantly outperform imputation when data are missing for longer periods. We further provide insights by showcasing how linear adaptation and data-driven partitions (even with a few subsets) approach the performance of the optimal, yet impractical, method of retraining for every possible realization of missing data.
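To make the two reference points in the abstract concrete, the following is a minimal sketch of the baselines only: mean imputation, and retraining a separate model on the features that remain available. It is not the paper's adaptive method. The data are synthetic, and the function names, feature dimensions, and the choice of ordinary least squares are illustrative assumptions that do not come from the paper.

```python
import numpy as np


def fit_least_squares(X, y):
    """Ordinary least squares with an intercept; returns (intercept, weights)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[0], coef[1:]


def predict_with_imputation(X, missing, intercept, weights, feature_means):
    """Fill missing columns with training means, then apply the full model."""
    X_filled = X.copy()
    X_filled[:, missing] = feature_means[missing]
    return intercept + X_filled @ weights


def predict_with_retraining(X_train, y_train, X, missing):
    """Refit on the available columns only (one model per missing pattern)."""
    available = ~missing
    intercept, weights = fit_least_squares(X_train[:, available], y_train)
    return intercept + X[:, available] @ weights


def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Synthetic, correlated features (assumption: stands in for lagged measurements).
rng = np.random.default_rng(0)
n_samples, n_features = 2000, 4
latent = rng.normal(size=(n_samples, 1))
X = latent + 0.5 * rng.normal(size=(n_samples, n_features))
y = X @ np.array([0.6, 0.3, 0.1, 0.0]) + 0.1 * rng.normal(size=n_samples)

X_train, y_train = X[:1500], y[:1500]
X_test, y_test = X[1500:], y[1500:]

intercept, weights = fit_least_squares(X_train, y_train)
feature_means = X_train.mean(axis=0)

# Hypothetical missing pattern: the first (most informative) feature is unavailable.
missing = np.array([True, False, False, False])

rmse_imputation = rmse(
    y_test,
    predict_with_imputation(X_test, missing, intercept, weights, feature_means),
)
rmse_retraining = rmse(
    y_test, predict_with_retraining(X_train, y_train, X_test, missing)
)
print(f"imputation RMSE: {rmse_imputation:.3f}")
print(f"retraining RMSE: {rmse_retraining:.3f}")
```

Retraining is the impractical benchmark because d features admit up to 2^d missing patterns, each requiring its own model. The adaptive models described above aim to approach that benchmark without enumerating the patterns.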