SY · AI · LG · NE · Aug 14, 2024

Real-world validation of safe reinforcement learning, model predictive control and decision tree-based home energy management systems

arXiv:2408.07435v2 · 13 citations · h-index: 35
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for real-world validation of energy management systems for residential users, but it is incremental as it primarily tests existing methods in a new setting without major breakthroughs.

The paper tackled the problem of validating machine learning-based home energy management systems in real-world settings, comparing reinforcement learning with a safety layer, a decision tree method, and model predictive control against benchmarks. The results showed that simple rules, the decision tree method, and model predictive control achieved similar costs, with only a 0.6% difference, while reinforcement learning had a 25.5% higher cost during training. The decision tree method was the safest, exceeding the grid limit by only 27.1 Wh compared to 593.9 Wh for reinforcement learning.

Recent advancements in machine learning-based energy management approaches, specifically reinforcement learning with a safety layer (OptLayerPolicy) and a metaheuristic algorithm generating a decision tree control policy (TreeC), have shown promise. However, their effectiveness has only been demonstrated in computer simulations. This paper presents the real-world validation of these methods, comparing them against model predictive control and simple rule-based control benchmarks. The experiments were conducted on the electrical installation of four reproductions of residential houses, each of which has its own battery, photovoltaic and dynamic load system emulating a non-controllable electrical load and a controllable electric vehicle charger. The results show that the simple rules, TreeC, and model predictive control-based methods achieved similar costs, with a difference of only 0.6%. The reinforcement learning-based method, still in its training phase, obtained a cost 25.5% higher than the other methods. Additional simulations show that the costs can be further reduced by using a more representative training dataset for TreeC and addressing errors in the model predictive control implementation caused by its reliance on accurate data from various sources. The OptLayerPolicy safety layer allows safe online training of a reinforcement learning agent in the real world, given an accurate constraint function formulation. The proposed safety layer method remains error-prone; nonetheless, it is found beneficial for all investigated methods. The TreeC method, which does require building a realistic simulation for training, exhibits the safest operational performance, exceeding the grid limit by only 27.1 Wh compared to 593.9 Wh for reinforcement learning.
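
To make the two safety notions in the abstract concrete, the following is a minimal sketch, not the paper's OptLayerPolicy implementation. It assumes a single controllable battery setpoint, a symmetric grid limit, positive power meaning import from the grid, and a fixed sampling step. The function names `project_to_safe_action` and `grid_limit_excess_wh` are hypothetical. The first function replaces a proposed action with the closest action that respects the grid limit, which in this one-dimensional case reduces to clipping. The second shows how an exceedance figure in Wh could be computed from sampled grid power.

```python
def project_to_safe_action(proposed_w, uncontrolled_w, grid_limit_w):
    """Return the battery setpoint (W) closest to proposed_w such that
    |uncontrolled_w + setpoint| <= grid_limit_w.

    Assumption: one controllable asset and a symmetric grid limit, so the
    safe set is an interval and the closest safe action is a simple clip.
    """
    lower = -grid_limit_w - uncontrolled_w  # most negative allowed setpoint
    upper = grid_limit_w - uncontrolled_w   # most positive allowed setpoint
    return min(max(proposed_w, lower), upper)


def grid_limit_excess_wh(grid_power_w, grid_limit_w, step_s):
    """Energy (Wh) by which sampled grid power exceeded the limit.

    Assumption: grid_power_w holds average power per step of step_s seconds.
    """
    excess_w = (max(abs(p) - grid_limit_w, 0.0) for p in grid_power_w)
    return sum(excess_w) * step_s / 3600.0


# Hypothetical numbers: the agent proposes charging at 3 kW while the house
# already imports 4 kW and the grid limit is 5 kW.
safe_setpoint = project_to_safe_action(3000.0, 4000.0, 5000.0)

# Three 10-minute samples, two of which exceed the 5 kW limit.
excess = grid_limit_excess_wh([4000.0, 5600.0, -5300.0], 5000.0, 600.0)
```

With these illustrative inputs the proposed 3 kW charge is reduced to 1 kW, and the two violating samples add up to 150 Wh of exceedance. A safety layer of this kind is only as good as its constraint function, which matches the abstract's caveat about needing an accurate constraint formulation.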

Code Implementations: 1 repo
