Open-Source Ground-based Sky Image Datasets for Very Short-term Solar Forecasting, Cloud Analysis and Modeling: A Comprehensive Survey
This work addresses a data bottleneck for researchers in solar forecasting and related fields, offering a curated resource for improving machine learning models. The contribution is incremental: it compiles existing datasets rather than introducing new methods.
The authors tackled the lack of diverse sky image datasets for solar forecasting by surveying 72 open-source ground-based sky image datasets, developing a multi-criteria ranking system to evaluate them, and providing insights for their use in applications like cloud analysis and modeling.
Sky-image-based solar forecasting using deep learning has been recognized as a promising approach for reducing the uncertainty in solar power generation. However, one of the biggest challenges is the lack of massive and diversified sky image samples. In this study, we present a comprehensive survey of open-source ground-based sky image datasets for very short-term solar forecasting (i.e., forecasting horizons of less than 30 minutes), as well as for related research areas that could help improve solar forecasting methods, including cloud segmentation, cloud classification and cloud motion prediction. We first identify 72 open-source sky image datasets that satisfy the needs of machine/deep learning. We then construct a database of information about various aspects of the identified datasets. To evaluate each surveyed dataset, we further develop a multi-criteria ranking system based on 8 dimensions of the datasets that could have important impacts on how the data are used. Finally, we provide insights on the usage of these datasets for different applications. We hope this paper can provide an overview for researchers who are looking for datasets for very short-term solar forecasting and related areas.
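To make the idea of a multi-criteria ranking concrete, the following is a minimal sketch of one common way such a ranking can be computed: a weighted average of per-criterion scores. It is not the authors' scoring scheme. The abstract does not name the 8 dimensions or say how they are weighted, so the criterion positions, the 0-to-1 score scale, the equal default weights, and the dataset names below are all assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

# The survey evaluates datasets on 8 dimensions. Their names and scales are
# not given in the abstract, so each dataset here is simply a list of 8
# hypothetical scores in [0, 1].
N_CRITERIA = 8


def rank_datasets(
    scores: Dict[str, List[float]],
    weights: Optional[List[float]] = None,
) -> List[Tuple[str, float]]:
    """Rank datasets by the weighted average of their criterion scores.

    Assumption: a plain weighted average with equal weights by default.
    The paper may use a different aggregation.
    """
    if weights is None:
        weights = [1.0] * N_CRITERIA
    if len(weights) != N_CRITERIA:
        raise ValueError(f"expected {N_CRITERIA} weights, got {len(weights)}")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")

    ranked = []
    for name, values in scores.items():
        if len(values) != N_CRITERIA:
            raise ValueError(f"{name}: expected {N_CRITERIA} scores")
        overall = sum(w * v for w, v in zip(weights, values)) / total_weight
        ranked.append((name, overall))

    # Highest overall score first; ties are broken alphabetically by name.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


if __name__ == "__main__":
    # Hypothetical datasets and scores, for illustration only.
    example = {
        "dataset_a": [1.0] * 8,
        "dataset_b": [0.5] * 8,
        "dataset_c": [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
    }
    for name, overall in rank_datasets(example):
        print(f"{name}: {overall:.3f}")
```

Passing a custom `weights` list lets a reader emphasize the dimensions that matter most for a given application, for example weighting image resolution more heavily for cloud segmentation than for forecasting.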