Unified machine learning tasks and datasets for enhancing renewable energy
This work addresses the need for standardized datasets to support the development of ML models for enhancing renewable energy and mitigating climate change. Its contribution is incremental: it focuses on dataset creation rather than novel methods.
The authors address the lack of unified datasets for testing multi-tasking ML models in renewable energy by introducing ETT-17, a collection of 17 datasets from six application domains that includes out-of-distribution validation and testing data, and they provide performance benchmarks.
Multi-tasking machine learning (ML) models exhibit prediction abilities in domains with little to no training data available (few-shot and zero-shot learning). Over-parameterized ML models are further capable of zero-loss training and near-optimal generalization performance. An open research question is how these novel paradigms can contribute to solving tasks related to enhancing the renewable energy transition and mitigating climate change. A collection of unified ML tasks and datasets from this domain would greatly facilitate the development and empirical testing of such models, but no such collection currently exists. Here, we introduce ETT-17 (Energy Transition Tasks-17), a collection of 17 datasets from six different application domains related to enhancing renewable energy, including out-of-distribution validation and testing data. We unify all tasks and datasets such that they can be solved using a single multi-tasking ML model. We further analyse the dimensions of each dataset; investigate what they require for designing over-parameterized models; introduce a set of dataset scores that describe important properties of each task and dataset; and provide performance benchmarks.
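To make the link between dataset dimensions and over-parameterization concrete, the following is a minimal sketch, not the authors' procedure. It assumes a common rule of thumb: a model counts as over-parameterized once its number of trainable parameters exceeds the number of scalar training targets (training samples times output dimension). The names `DatasetDims`, `interpolation_threshold`, `mlp_param_count`, and `min_width_for_overparameterization` are hypothetical, and the toy dimensions in the usage example are invented for illustration rather than taken from ETT-17.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatasetDims:
    """Hypothetical summary of one dataset's dimensions."""
    name: str
    n_train: int     # number of training samples
    input_dim: int   # features per sample
    output_dim: int  # label values per sample


def interpolation_threshold(d: DatasetDims) -> int:
    """Number of scalar training targets (assumed rule-of-thumb threshold)."""
    return d.n_train * d.output_dim


def mlp_param_count(input_dim: int, hidden_width: int, output_dim: int,
                    n_hidden_layers: int = 1) -> int:
    """Weights plus biases of a fully connected network with equal-width hidden layers."""
    sizes = [input_dim] + [hidden_width] * n_hidden_layers + [output_dim]
    # Each layer has (fan_in + 1) * fan_out parameters; the +1 is the bias.
    return sum((fan_in + 1) * fan_out
               for fan_in, fan_out in zip(sizes[:-1], sizes[1:]))


def min_width_for_overparameterization(d: DatasetDims,
                                       n_hidden_layers: int = 1) -> int:
    """Smallest hidden width whose parameter count exceeds the threshold."""
    threshold = interpolation_threshold(d)
    width = 1
    while mlp_param_count(d.input_dim, width, d.output_dim,
                          n_hidden_layers) <= threshold:
        width += 1
    return width


# Toy example with made-up dimensions (not an ETT-17 dataset).
toy = DatasetDims(name="toy", n_train=1000, input_dim=10, output_dim=2)
toy_width = min_width_for_overparameterization(toy)
print(toy.name, interpolation_threshold(toy), toy_width)
```

Under this assumption, the required model size grows with both the number of training samples and the label dimension, which is one reason the dimensions of each dataset matter when sizing a single multi-tasking model. Other definitions of over-parameterization would give different thresholds.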