LG · Jan 4, 2023

Augmenting data-driven models for energy systems through feature engineering: A Python framework for feature engineering

arXiv:2301.01720v1 · 2 citations · h-index: 1
AI Analysis

This work gives researchers and practitioners in energy systems a tool for enhancing data-driven models, but the contribution is incremental because it builds on existing scikit-learn methods.

The authors address data quality for machine learning models in energy systems by developing a Python framework for feature engineering. In an energy demand prediction case study, the engineered features improved prediction accuracy.

Data-driven modeling is an increasingly popular approach in energy systems modeling. It applies machine learning methods such as linear regression, neural networks, or decision-tree-based methods. While these methods do not require domain knowledge, they are sensitive to data quality, so improving the quality of a dataset benefits the resulting machine learning models. Data quality can be improved through preprocessing. One type of preprocessing is feature engineering, which focuses on evaluating and improving the quality of specific features in the dataset and includes feature creation, feature expansion, and feature selection. This work presents a Python framework, built on the scikit-learn library, that implements methods for feature creation, expansion, and selection, as well as methods for transforming and filtering data. The framework is demonstrated in a case study on energy demand prediction, in which a data-driven model is created using selected feature engineering methods. The results show that the engineered features improve prediction accuracy.
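To make the three feature engineering steps named in the abstract concrete, the sketch below chains feature creation, expansion, and selection in a plain scikit-learn pipeline and compares it with a model trained on the raw inputs. It does not use the paper's framework or reproduce its case study: the helper `add_cyclic_hour`, the synthetic hourly data, the choice of `PolynomialFeatures` and `SelectKBest`, and all parameter values are assumptions made for illustration only.

```python
import numpy as np
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, PolynomialFeatures


def add_cyclic_hour(X):
    """Feature creation (hypothetical helper): append sin/cos of the hour in column 0."""
    angle = 2.0 * np.pi * X[:, 0] / 24.0
    return np.column_stack([X, np.sin(angle), np.cos(angle)])


# Synthetic hourly data: an assumption for illustration, not the paper's dataset.
rng = np.random.default_rng(0)
n = 24 * 60  # 60 days of hourly samples
hour = (np.arange(n) % 24).astype(float)
temperature = rng.normal(10.0, 5.0, n)
demand = (
    50.0
    + 15.0 * np.cos(2.0 * np.pi * (hour - 12.0) / 24.0)  # daily cycle, peak at noon
    - 0.8 * temperature                                    # lower demand when warm
    + rng.normal(0.0, 1.0, n)                              # measurement noise
)
X = np.column_stack([hour, temperature])

# Chronological split: first 80 % for training, remainder for testing.
split = int(0.8 * n)
X_train, X_test = X[:split], X[split:]
y_train, y_test = demand[:split], demand[split:]

# Baseline: linear regression on the raw hour and temperature columns.
baseline = LinearRegression().fit(X_train, y_train)

# Engineered: creation -> expansion -> selection -> model.
engineered = Pipeline([
    ("create", FunctionTransformer(add_cyclic_hour)),
    ("expand", PolynomialFeatures(degree=2, include_bias=False)),
    ("select", SelectKBest(score_func=f_regression, k=8)),
    ("model", LinearRegression()),
])
engineered.fit(X_train, y_train)

r2_baseline = r2_score(y_test, baseline.predict(X_test))
r2_engineered = r2_score(y_test, engineered.predict(X_test))
print(f"R2 baseline:   {r2_baseline:.3f}")
print(f"R2 engineered: {r2_engineered:.3f}")
```

On this synthetic data the raw hour value carries almost no linear information about the daily cycle, so the cyclic encoding accounts for most of the gain. Whether a comparable improvement appears on real energy data depends on the dataset and the model.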
