AISP May 9, 2022

On Designing Data Models for Energy Feature Stores

arXiv:2205.04267v2 | h-index: 15 | Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses data management challenges faced by developers of ML-based energy applications, but its contribution is incremental because it builds on existing feature store concepts.

The paper tackles the lack of domain-specific data models for managing features in machine learning pipelines for energy applications. It proposes a taxonomy, shows on a short-term forecasting dataset that richer data models improve model performance, and benchmarks three feature management solutions.

The digital transformation of the energy infrastructure enables new, data-driven applications, often supported by machine learning models. However, domain-specific data transformations, pre-processing and management in modern data-driven pipelines are yet to be addressed. In this paper we perform a first study on generic data models able to support the design of feature management solutions, which are the most important component in developing ML-based energy applications. We first propose a taxonomy for designing data models suitable for energy applications, and explain how this model can support the design of features and their subsequent management by specialized feature stores. Using a short-term forecasting dataset, we show the benefits of designing richer data models and engineering the features on the performance of the resulting models. Finally, we benchmark three complementary feature management solutions, including an open-source feature store suitable for time series.
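To make the idea of engineering features from an energy time series concrete, the following is a minimal sketch in Python using pandas. It is not the paper's pipeline or dataset: the function name `build_features`, the column names, and the synthetic hourly series are all assumptions chosen for illustration. It derives lag, rolling, and calendar features of the kind a feature store for short-term forecasting might manage.

```python
import numpy as np
import pandas as pd


def build_features(load: pd.Series) -> pd.DataFrame:
    """Derive lag, rolling, and calendar features from an hourly load series.

    Hypothetical helper for illustration; column names are assumptions.
    """
    df = pd.DataFrame({"load": load})
    # Lag features: consumption 1 hour and 24 hours earlier.
    df["lag_1h"] = df["load"].shift(1)
    df["lag_24h"] = df["load"].shift(24)
    # Mean over the previous 24 hours, excluding the current value.
    df["roll_mean_24h"] = df["load"].shift(1).rolling(24).mean()
    # Calendar context taken from the timestamp index.
    df["hour"] = df.index.hour
    df["is_weekend"] = df.index.dayofweek >= 5
    # Drop the warm-up rows that lack a full 24-hour history.
    return df.dropna()


# Synthetic hourly series for illustration only (not the paper's data).
index = pd.date_range("2022-01-01", periods=72, freq=pd.Timedelta(hours=1))
load = pd.Series(np.arange(72, dtype=float), index=index)
features = build_features(load)
```

Each row of `features` pairs the target `load` with inputs that depend only on earlier observations and on the timestamp, which is the property a forecasting model needs to avoid leaking future values into training.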

